XGBoost is a powerful and popular implementation of the gradient boosting ensemble algorithm.

An important aspect in configuring XGBoost models is the choice of loss function that is minimized during the training of the model.

The loss function must be matched to the predictive modeling problem type, in the same way we must choose appropriate loss functions based on problem types with deep learning neural networks.

In this tutorial, you will discover how to configure loss functions for XGBoost ensemble models.

After completing this tutorial, you will know:

• Specifying loss functions used when training XGBoost ensembles is a critical step, much like neural networks.
• How to configure XGBoost loss functions for binary and multi-class classification tasks.
• How to configure XGBoost loss functions for regression predictive modeling tasks.

Let’s get started.

A Gentle Introduction to XGBoost Loss Functions
Photo by Kevin Rheese, some rights reserved.

## Tutorial Overview

This tutorial is divided into three parts; they are:

1. XGBoost and Loss Functions
2. XGBoost Loss for Classification
3. XGBoost Loss for Regression

## XGBoost and Loss Functions

Extreme Gradient Boosting, or XGBoost for short, is an efficient open-source implementation of the gradient boosting algorithm. As such, XGBoost is an algorithm, an open-source project, and a Python library.

It was initially developed by Tianqi Chen and was described by Chen and Carlos Guestrin in their 2016 paper titled “XGBoost: A Scalable Tree Boosting System.”

It is designed to be both computationally efficient (e.g. fast to execute) and highly effective, perhaps more effective than other open-source implementations.

XGBoost supports a range of different predictive modeling problems, most notably classification and regression.

XGBoost is trained by minimizing the loss of an objective function against a dataset. As such, the choice of loss function is a critical hyperparameter and is tied directly to the type of problem being solved, much like deep learning neural networks.

The implementation allows the objective function to be specified via the “objective” hyperparameter, and sensible defaults are used that work for most cases.

Nevertheless, there remains some confusion among beginners as to what loss function to use when training XGBoost models.

We will take a closer look at how to configure the loss function for XGBoost in this tutorial.

Before we get started, let’s get set up.

XGBoost can be installed as a standalone library and an XGBoost model can be developed using the scikit-learn API.

The first step is to install the XGBoost library if it is not already installed. This can be achieved using the pip Python package manager on most platforms, for example by running the command “pip install xgboost”.

You can then confirm that the XGBoost library was installed correctly and can be used by running the following script.

Running the script will print your version of the XGBoost library you have installed.

Your version should be the same or higher. If not, you must upgrade your version of the XGBoost library.

It is possible that you may have problems with the latest version of the library. It is not your fault.

Sometimes, the most recent version of the library imposes spare requirements or may be less stable.

If you do have errors when trying to run the above script, I recommend downgrading to version 1.0.1 (or lower). This can be achieved by specifying the version to install to the pip command, for example “pip install xgboost==1.0.1”.

If you see a warning message, you can safely ignore it for now. For example, below is an example of a warning message that you may see and can ignore:

If you require specific instructions for your development environment, see the tutorial:

The XGBoost library has its own custom API, although we will use the method via the scikit-learn wrapper classes: XGBRegressor and XGBClassifier. This will allow us to use the full suite of tools from the scikit-learn machine learning library to prepare data and evaluate models.

Both models operate the same way and take the same arguments that influence how the decision trees are created and added to the ensemble.

For more on how to use the XGBoost API with scikit-learn, see the tutorial:

Next, let’s take a closer look at how to configure the loss function for XGBoost on classification problems.

## XGBoost Loss for Classification

Classification tasks involve predicting a label or probability for each possible class, given an input sample.

There are two main types of classification tasks with mutually exclusive labels: binary classification that has two class labels, and multi-class classification that has more than two class labels.

• Binary Classification: Classification task with two class labels.
• Multi-Class Classification: Classification task with more than two class labels.

For more on the different types of classification tasks, see the tutorial:

XGBoost provides loss functions for each of these problem types.

It is typical in machine learning to train a model to predict the probability of class membership for classification tasks and, if the task requires crisp class labels, to post-process the predicted probabilities (e.g. use argmax).

This approach is used when training deep learning neural networks for classification, and is also recommended when using XGBoost for classification.

The loss function used for predicting probabilities for binary classification problems is “binary:logistic” and the loss function for predicting class probabilities for multi-class problems is “multi:softprob”.

• “binary:logistic”: XGBoost loss function for binary classification.
• “multi:softprob”: XGBoost loss function for multi-class classification.

These string values can be specified via the “objective” hyperparameter when configuring your XGBClassifier model.

For example, for binary classification:

And, for multi-class classification:

Importantly, if you do not specify the “objective” hyperparameter, the XGBClassifier will automatically choose one of these loss functions based on the data provided during training.

We can make this concrete with a worked example.

The example below creates a synthetic binary classification dataset, fits an XGBClassifier on the dataset with default hyperparameters, then prints the model objective configuration.

Running the example fits the model on the dataset and prints the loss function configuration.

We can see the model automatically chose a loss function for binary classification.

Alternately, we can specify the objective and fit the model, confirming the loss function was used.

Running the example fits the model on the dataset and prints the loss function configuration.

We can see the model used the specified loss function for binary classification.

Let’s repeat this example on a dataset with more than two classes. In this case, three classes.

The complete example is listed below.

Running the example fits the model on the dataset and prints the loss function configuration.

We can see the model automatically chose a loss function for multi-class classification.

Alternately, we can manually specify the loss function and confirm it was used to train the model.

Running the example fits the model on the dataset and prints the loss function configuration.

We can see the model used the specified loss function for multi-class classification.

Finally, there are other loss functions you can use for classification, including: “binary:logitraw” and “binary:hinge” for binary classification and “multi:softmax” for multi-class classification.

You can see a full list here:

Next, let’s take a look at XGBoost loss functions for regression.

## XGBoost Loss for Regression

Regression refers to predictive modeling problems where a numerical value is predicted given an input sample.

Although predicting a probability sounds like a regression problem (i.e. a probability is a numerical value), it is often not considered a regression type predictive modeling problem.

The XGBoost objective function used when predicting numerical values is the “reg:squarederror” loss function.

• “reg:squarederror”: Loss function for regression predictive modeling problems.

This string value can be specified via the “objective” hyperparameter when configuring your XGBRegressor model.

For example:

Importantly, if you do not specify the “objective” hyperparameter, the XGBRegressor will automatically choose this objective function for you.

We can make this concrete with a worked example.

The example below creates a synthetic regression dataset, fits an XGBRegressor on the dataset, then prints the model objective configuration.

Running the example fits the model on the dataset and prints the loss function configuration.

We can see the model automatically chose a loss function for regression.

Alternately, we can specify the objective and fit the model, confirming the loss function was used.

Running the example fits the model on the dataset and prints the loss function configuration.

We can see the model used the specified loss function for regression.

Finally, there are other loss functions you can use for regression, including: “reg:squaredlogerror”, “reg:logistic”, “reg:pseudohubererror”, “reg:gamma”, and “reg:tweedie”.

You can see a full list here:

This section provides more resources on the topic if you are looking to go deeper.

## Summary

In this tutorial, you discovered how to configure loss functions for XGBoost ensemble models.

Specifically, you learned:

• Specifying loss functions used when training XGBoost ensembles is a critical step, much like neural networks.
• How to configure XGBoost loss functions for binary and multi-class classification tasks.
• How to configure XGBoost loss functions for regression predictive modeling tasks.

Do you have any questions?
