Some prediction problems require predicting both a numeric value and a class label for the same input.

A simple approach is to develop both regression and classification predictive models on the same data and use the models sequentially.

An alternative and often more effective approach is to develop a single neural network model that can predict both a numeric value and a class label from the same input. This is called a **multi-output model** and can be relatively easy to develop and evaluate using modern deep learning libraries such as Keras and TensorFlow.

In this tutorial, you will discover how to develop a neural network for combined regression and classification predictions.

After completing this tutorial, you will know:

- Some prediction problems require predicting both numeric and class label values for each input example.
- How to develop separate regression and classification models for problems that require multiple outputs.
- How to develop and evaluate a neural network model capable of making simultaneous regression and classification predictions.

Let’s get started.

## Tutorial Overview

This tutorial is divided into three parts; they are:

- Single Model for Regression and Classification
- Separate Regression and Classification Models
    - Abalone Dataset
    - Regression Model
    - Classification Model
- Combined Regression and Classification Models

## Single Model for Regression and Classification

It is common to develop a deep learning neural network model for a regression or classification problem, but on some predictive modeling tasks, we may want to develop a single model that can make both regression and classification predictions.

Regression refers to predictive modeling problems that involve predicting a numeric value given an input.

Classification refers to predictive modeling problems that involve predicting a class label or probability of class labels for a given input.

For more on the difference between classification and regression, see the tutorial:

There may be some problems where we want to predict both a numerical value and a classification value.

One approach to solving this problem is to develop a separate model for each prediction that is required.

The problem with this approach is that the predictions made by the separate models may diverge.

An alternative approach that can be used with neural network models is to develop a single model capable of making separate predictions for a numeric and class output for the same input.

This is called a multi-output neural network model.

The benefit of this type of model is that we have a single model to develop and maintain instead of two, and that training and updating the model on both output types at the same time may offer more consistency in the predictions between the two output types.

We will develop a multi-output neural network model capable of making regression and classification predictions at the same time.

First, let’s select a dataset where this requirement makes sense and start by developing separate models for both regression and classification predictions.

## Separate Regression and Classification Models

In this section, we will start by selecting a real dataset where we may want regression and classification predictions at the same time, then develop separate models for each type of prediction.

### Abalone Dataset

We will use the “*abalone*” dataset.

Determining the age of an abalone is a time-consuming task and it is desirable to determine the age from physical details alone.

This is a dataset that describes the physical details of abalone and requires predicting the number of rings of the abalone, which is a proxy for the age of the creature.

You can learn more about the dataset from here:

The “*age*” can be predicted as either a numerical value (in years) or a class label (ordinal year as a class).

No need to download the dataset as we will download it automatically as part of the worked examples.

This dataset is an example of a problem where we may want both a numerical prediction and a classification of an input.

First, let’s develop an example to download and summarize the dataset.

```python
# load and summarize the abalone dataset
from pandas import read_csv
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/abalone.csv'
dataframe = read_csv(url, header=None)
# summarize shape
print(dataframe.shape)
# summarize first few lines
print(dataframe.head())
```

Running the example first downloads and summarizes the shape of the dataset.

We can see that there are 4,177 examples (rows) that we can use to train and evaluate a model and 9 features (columns) including the target variable.

We can see that all input variables are numeric except the first, which is a string value.

To keep data preparation simple, we will drop the first column from our models and focus on modeling the numeric input values.

```
(4177, 9)
   0      1      2      3       4       5       6      7   8
0  M  0.455  0.365  0.095  0.5140  0.2245  0.1010  0.150  15
1  M  0.350  0.265  0.090  0.2255  0.0995  0.0485  0.070   7
2  F  0.530  0.420  0.135  0.6770  0.2565  0.1415  0.210   9
3  M  0.440  0.365  0.125  0.5160  0.2155  0.1140  0.155  10
4  I  0.330  0.255  0.080  0.2050  0.0895  0.0395  0.055   7
```

We can use the data as the basis for developing separate regression and classification Multilayer Perceptron (MLP) neural network models.

**Note**: we are not trying to develop an optimal model for this dataset; instead we are demonstrating a specific technique: developing a model that can make both regression and classification predictions.

### Regression Model

In this section, we will develop a regression MLP model for the abalone dataset.

First, we must separate the columns into input and output elements and drop the first column that contains string values.

We will also force all loaded columns to have a float type (expected by neural network models) and record the number of input features, which will need to be known by the model later.

```python
...
# split into input (X) and output (y) variables
X, y = dataset[:, 1:-1], dataset[:, -1]
X, y = X.astype('float'), y.astype('float')
n_features = X.shape[1]
```
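To make the slicing concrete, here is a minimal pure-Python sketch of the same column selection applied to the first three rows printed earlier. It uses plain lists rather than the NumPy array in the tutorial, so the list comprehensions are only a stand-in for `dataset[:, 1:-1]` and `dataset[:, -1]`.

```python
# first three rows of the dataset, copied from the summary output
rows = [
    ['M', 0.455, 0.365, 0.095, 0.5140, 0.2245, 0.1010, 0.150, 15],
    ['M', 0.350, 0.265, 0.090, 0.2255, 0.0995, 0.0485, 0.070, 7],
    ['F', 0.530, 0.420, 0.135, 0.6770, 0.2565, 0.1415, 0.210, 9],
]
# inputs: every column except the first (string) and the last (target)
X = [[float(value) for value in row[1:-1]] for row in rows]
# output: the last column (number of rings) as a float
y = [float(row[-1]) for row in rows]
n_features = len(X[0])
print(n_features)
print(y)
```

With nine columns in total, dropping the first column and the target leaves seven input features.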

Next, we can split the dataset into a train and test dataset.

We will use a 67% random sample to train the model and the remaining 33% to evaluate the model.

```python
...
# split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
```

We can then define an MLP neural network model.

The model will have two hidden layers, the first with 20 nodes and the second with 10 nodes, both using ReLU activation and “*he normal*” weight initialization (a good practice). The number of layers and nodes were chosen arbitrarily.

The output layer will have a single node for predicting a numeric value and a linear activation function.

```python
...
# define the keras model
model = Sequential()
model.add(Dense(20, input_dim=n_features, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(10, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(1, activation='linear'))
```
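As a quick sanity check on the size of this network, the number of trainable weights can be worked out by hand. The sketch below assumes 7 input features (the 9 columns minus the dropped first column and the target) and that each fully connected layer has one weight per input-node pair plus one bias per node. The helper name `dense_params` is made up for illustration and is not part of Keras.

```python
def dense_params(n_inputs, n_units):
    # one weight per (input, unit) pair plus one bias per unit
    return n_inputs * n_units + n_units

n_features = 7             # assumption: 9 columns minus the string column and the target
layer_sizes = [20, 10, 1]  # two hidden layers and the single-node output layer

total = 0
n_inputs = n_features
for n_units in layer_sizes:
    count = dense_params(n_inputs, n_units)
    print('Dense(%d): %d parameters' % (n_units, count))
    total += count
    n_inputs = n_units
print('Total: %d' % total)
```

Under these assumptions the model is small (a few hundred weights), which is in keeping with the arbitrary, non-tuned configuration used here.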

The model will be trained to minimize the mean squared error (MSE) loss function using the effective Adam version of stochastic gradient descent.

```python
...
# compile the keras model
model.compile(loss='mse', optimizer='adam')
```

We will train the model for 150 epochs with a mini-batch size of 32 samples, again chosen arbitrarily.

```python
...
# fit the keras model on the dataset
model.fit(X_train, y_train, epochs=150, batch_size=32, verbose=2)
```

Finally, after the model is trained, we will evaluate it on the holdout test dataset and report the mean absolute error (MAE).

```python
...
# evaluate on test set
yhat = model.predict(X_test)
error = mean_absolute_error(y_test, yhat)
print('MAE: %.3f' % error)
```
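Note that the model is trained on one measure (MSE) and reported with another (MAE). The following sketch computes both by hand on a handful of made-up ring counts and predictions, purely to show how the two differ; the values and the helper function names are illustrative assumptions, not output from the model.

```python
# hypothetical ring counts and predictions, for illustration only
y_true = [15.0, 7.0, 9.0, 10.0]
y_pred = [12.0, 8.0, 9.5, 10.5]

def mse_sketch(actual, predicted):
    # average of the squared differences; large errors dominate
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def mae_sketch(actual, predicted):
    # average of the absolute differences; same units as the target (rings)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

mse = mse_sketch(y_true, y_pred)
mae = mae_sketch(y_true, y_pred)
print('MSE: %.3f' % mse)
print('MAE: %.3f' % mae)
```

MAE is in the same units as the target, which is why an error can be read directly as a number of rings, whereas MSE squares each error and so penalizes the single large miss much more heavily.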

Tying this all together, the complete example of an MLP neural network for the abalone dataset framed as a regression problem is listed below.

```python
# regression mlp model for the abalone dataset
from pandas import read_csv
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/abalone.csv'
dataframe = read_csv(url, header=None)
dataset = dataframe.values
# split into input (X) and output (y) variables
X, y = dataset[:, 1:-1], dataset[:, -1]
X, y = X.astype('float'), y.astype('float')
n_features = X.shape[1]
# split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
# define the keras model
model = Sequential()
model.add(Dense(20, input_dim=n_features, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(10, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(1, activation='linear'))
# compile the keras model
model.compile(loss='mse', optimizer='adam')
# fit the keras model on the dataset
model.fit(X_train, y_train, epochs=150, batch_size=32, verbose=2)
# evaluate on test set
yhat = model.predict(X_test)
error = mean_absolute_error(y_test, yhat)
print('MAE: %.3f' % error)
```

Running the example will prepare the dataset, fit the model, and report an estimate of model error.

**Note**: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

In this case, we can see that the model achieved an error of about 1.5 (rings).

```
...
Epoch 145/150
88/88 - 0s - loss: 4.6130
Epoch 146/150
88/88 - 0s - loss: 4.6182
Epoch 147/150
88/88 - 0s - loss: 4.6277
Epoch 148/150
88/88 - 0s - loss: 4.6437
Epoch 149/150
88/88 - 0s - loss: 4.6166
Epoch 150/150
88/88 - 0s - loss: 4.6132
MAE: 1.554
```
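The `88/88` at the start of each log line is the number of mini-batches processed per epoch. It follows from the dataset size, the 33% test split, and the batch size of 32, as the rough check below shows. The check assumes the test set size is rounded up and the remainder is used for training; rounding the other way gives the same number of batches.

```python
from math import ceil

n_samples = 4177   # rows in the abalone dataset
test_size = 0.33   # fraction held out for evaluation
batch_size = 32    # mini-batch size used in fit()

# assumption: test set size is rounded up, the rest is used for training
n_test = ceil(test_size * n_samples)
n_train = n_samples - n_test
# the final batch of each epoch is smaller than 32 but still counts as a step
steps_per_epoch = ceil(n_train / batch_size)
print(n_train, n_test, steps_per_epoch)
```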

So far so good.

Next, let’s look at developing a similar model for classification.

### Classification Model

The abalone dataset can be framed as a classification problem where each “*ring*” integer is taken as a separate class label.

The example and model are much the same as the previous example for regression, with a few important changes.

This requires first assigning a separate integer for each “*ring*” value, starting at 0 and ending at the total number of “*classes*” minus one.

This can be achieved using the LabelEncoder.

We can also record the total number of classes as the total number of unique encoded class values, which will be needed by the model later.

```python
...
# encode ring counts as integer class labels
y = LabelEncoder().fit_transform(y)
n_class = len(unique(y))
```
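To see what this encoding step does, here is a minimal pure-Python sketch of the same idea: each distinct value is mapped to its position in the sorted list of distinct values. It is a stand-in for illustration rather than the scikit-learn implementation, and the ring counts are a small made-up sample.

```python
# hypothetical ring counts, for illustration only
rings = [15.0, 7.0, 9.0, 10.0, 7.0]

# sorted distinct values define the classes; the position is the class label
classes = sorted(set(rings))
to_index = {value: index for index, value in enumerate(classes)}

encoded = [to_index[value] for value in rings]
n_class = len(classes)
# the mapping can be reversed to recover the original ring counts
decoded = [classes[index] for index in encoded]

print(encoded)
print(n_class)
print(decoded)
```

The encoded labels run from 0 to `n_class - 1` regardless of the raw values, which is the form expected when one output node is used per class. It also means an encoded label is not the ring count itself, so predictions need to be mapped back through `classes` if the ring count is wanted.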

After splitting the data into train and test sets as before, we can define the model, change the number of outputs to equal the number of classes, and use the softmax activation function, which is common for multi-class classification.

```python
...
# define the keras model
model = Sequential()
model.add(Dense(20, input_dim=n_features, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(10, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(n_class, activation='softmax'))
```

Given we have encoded class labels as integer values, we can fit the model by minimizing the sparse categorical cross-entropy loss function, appropriate for multi-class classification tasks with integer encoded class labels.

```python
...
# compile the keras model
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
```

After the model is fit on the training dataset as before, we can evaluate the performance of the model by calculating the classification accuracy on the hold-out test set.

```python
...
# evaluate on test set
yhat = model.predict(X_test)
yhat = argmax(yhat, axis=-1).astype('int')
acc = accuracy_score(y_test, yhat)
print('Accuracy: %.3f' % acc)
```
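The softmax layer returns one probability per class for each example, so the predictions must be reduced to a single class index before accuracy can be calculated. The sketch below walks through that reduction in plain Python with made-up probabilities for a three-class problem; the numbers are assumptions for illustration and are not model output.

```python
# hypothetical softmax outputs: one row per example, one column per class
probabilities = [
    [0.10, 0.60, 0.30],
    [0.70, 0.20, 0.10],
    [0.20, 0.30, 0.50],
]
# hypothetical encoded class labels for the same three examples
y_true = [1, 0, 1]

# argmax: the index of the largest probability in each row
predicted = [row.index(max(row)) for row in probabilities]

# accuracy: fraction of examples where the predicted index matches the label
correct = sum(1 for p, t in zip(predicted, y_true) if p == t)
accuracy = correct / len(y_true)

print(predicted)
print('Accuracy: %.3f' % accuracy)
```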

Tying this all together, the complete example of an MLP neural network for the abalone dataset framed as a classification problem is listed below.

```python
# classification mlp model for the abalone dataset
from numpy import unique
from numpy import argmax
from pandas import read_csv
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/abalone.csv'
dataframe = read_csv(url, header=None)
dataset = dataframe.values
# split into input (X) and output (y) variables
X, y = dataset[:, 1:-1], dataset[:, -1]
X, y = X.astype('float'), y.astype('float')
n_features = X.shape[1]
# encode ring counts as integer class labels
y = LabelEncoder().fit_transform(y)
n_class = len(unique(y))
# split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
# define the keras model
model = Sequential()
model.add(Dense(20, input_dim=n_features, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(10, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(n_class, activation='softmax'))
# compile the keras model
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
# fit the keras model on the dataset
model.fit(X_train, y_train, epochs=150, batch_size=32, verbose=2)
# evaluate on test set
yhat = model.predict(X_test)
yhat = argmax(yhat, axis=-1).astype('int')
acc = accuracy_score(y_test, yhat)
print('Accuracy: %.3f' % acc)
```

Running the example will prepare the dataset, fit the model, and report an estimate of model accuracy.

**Note**: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

In this case, we can see that the model achieved an accuracy of about 27%.

```
...
Epoch 145/150
88/88 - 0s - loss: 1.9271
Epoch 146/150
88/88 - 0s - loss: 1.9265
Epoch 147/150
88/88 - 0s - loss: 1.9265
Epoch 148/150
88/88 - 0s - loss: 1.9271
Epoch 149/150
88/88 - 0s - loss: 1.9262
Epoch 150/150
88/88 - 0s - loss: 1.9260
Accuracy: 0.274
```

So far so good.

Next, let’s look at developing a combined model capable of both regression and classification predictions.

## Combined Regression and Classification Models

In this section, we can develop a single MLP neural network model that can make both regression and classification predictions for a single input.

This is called a multi-output model and can be developed using the functional Keras API.

For more on this functional API, which can be tricky for beginners, see the tutorials:

First, the dataset must be prepared.

We can prepare the dataset as we did before for classification, although we should save the encoded target variable with a separate name to differentiate it from the raw target variable values.

```python
...
# encode ring counts as integer class labels
y_class = LabelEncoder().fit_transform(y)
n_class = len(unique(y_class))
```

We can then split the input, raw output, and encoded output variables into train and test sets.

```python
...
# split data into train and test sets
X_train, X_test, y_train, y_test, y_train_class, y_test_class = train_test_split(X, y, y_class, test_size=0.33, random_state=1)
```

Next, we can define the model using the functional API.

The model takes the same number of inputs as before with the standalone models and uses two hidden layers configured in the same way.

```python
...
# input
visible = Input(shape=(n_features,))
hidden1 = Dense(20, activation='relu', kernel_initializer='he_normal')(visible)
hidden2 = Dense(10, activation='relu', kernel_initializer='he_normal')(hidden1)
```

We can then define two separate output layers that connect to the second hidden layer of the model.

The first is a regression output layer that has a single node and a linear activation function.

```python
...
# regression output
out_reg = Dense(1, activation='linear')(hidden2)
```

The second is a classification output layer that has one node for each class being predicted and uses a softmax activation function.

```python
...
# classification output
out_clas = Dense(n_class, activation='softmax')(hidden2)
```

We can then define the model with a single input layer and two output layers.

```python
...
# define model
model = Model(inputs=visible, outputs=[out_reg, out_clas])
```

Given the two output layers, we can compile the model with two loss functions: mean squared error loss for the first (regression) output layer and sparse categorical cross-entropy for the second (classification) output layer.

```python
...
# compile the keras model
model.compile(loss=['mse', 'sparse_categorical_crossentropy'], optimizer='adam')
```

We can also create a plot of the model for reference.

This requires that pydot and Graphviz are installed. If this is a problem, you can comment out this line and the import statement for the *plot_model()* function.

```python
...
# plot graph of model
plot_model(model, to_file='model.png', show_shapes=True)
```

Each time the model makes a prediction, it will predict two values.

Similarly, when training the model, it will need one target variable per sample for each output.

As such, we can train the model, carefully providing both the regression target and classification target data to each output of the model.

```python
...
# fit the keras model on the dataset
model.fit(X_train, [y_train, y_train_class], epochs=150, batch_size=32, verbose=2)
```

The fit model can then make a regression and classification prediction for each example in the hold-out test set.

```python
...
# make predictions on test set
yhat1, yhat2 = model.predict(X_test)
```

The first array can be used to evaluate the regression predictions via mean absolute error.

```python
...
# calculate error for regression model
error = mean_absolute_error(y_test, yhat1)
print('MAE: %.3f' % error)
```

The second array can be used to evaluate the classification predictions via classification accuracy.

```python
...
# evaluate accuracy for classification model
yhat2 = argmax(yhat2, axis=-1).astype('int')
acc = accuracy_score(y_test_class, yhat2)
print('Accuracy: %.3f' % acc)
```
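Because both outputs are predicted from the same input, it can also be useful to check how often they agree. The following is a minimal sketch using made-up predictions, not results from the model. The `classes` list stands in for the sorted ring values that the label encoder would have learned, and the one-ring tolerance is an arbitrary choice.

```python
# hypothetical sorted ring values learned by the label encoder
classes = [7.0, 9.0, 10.0, 15.0]
# hypothetical outputs for four test examples
yhat_reg = [9.4, 7.2, 14.1, 10.6]   # regression output, in rings
yhat_class = [1, 0, 3, 1]           # classification output, as class indices

# map each predicted class index back to a ring count
decoded = [classes[index] for index in yhat_class]

# the two outputs "agree" when they are within one ring of each other
tolerance = 1.0
agree = [abs(r - c) <= tolerance for r, c in zip(yhat_reg, decoded)]
agreement_rate = sum(agree) / len(agree)

print(decoded)
print(agree)
print('Agreement: %.2f' % agreement_rate)
```

A check like this is one way to quantify the consistency between the two output types that motivates using a single model.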

And that’s it.

Tying this together, the complete example of training and evaluating a multi-output model for combined regression and classification predictions on the abalone dataset is listed below.

```python
# mlp for combined regression and classification predictions on the abalone dataset
from numpy import unique
from numpy import argmax
from pandas import read_csv
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import plot_model
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/abalone.csv'
dataframe = read_csv(url, header=None)
dataset = dataframe.values
# split into input (X) and output (y) variables
X, y = dataset[:, 1:-1], dataset[:, -1]
X, y = X.astype('float'), y.astype('float')
n_features = X.shape[1]
# encode ring counts as integer class labels
y_class = LabelEncoder().fit_transform(y)
n_class = len(unique(y_class))
# split data into train and test sets
X_train, X_test, y_train, y_test, y_train_class, y_test_class = train_test_split(X, y, y_class, test_size=0.33, random_state=1)
# input
visible = Input(shape=(n_features,))
hidden1 = Dense(20, activation='relu', kernel_initializer='he_normal')(visible)
hidden2 = Dense(10, activation='relu', kernel_initializer='he_normal')(hidden1)
# regression output
out_reg = Dense(1, activation='linear')(hidden2)
# classification output
out_clas = Dense(n_class, activation='softmax')(hidden2)
# define model
model = Model(inputs=visible, outputs=[out_reg, out_clas])
# compile the keras model
model.compile(loss=['mse', 'sparse_categorical_crossentropy'], optimizer='adam')
# plot graph of model
plot_model(model, to_file='model.png', show_shapes=True)
# fit the keras model on the dataset
model.fit(X_train, [y_train, y_train_class], epochs=150, batch_size=32, verbose=2)
# make predictions on test set
yhat1, yhat2 = model.predict(X_test)
# calculate error for regression model
error = mean_absolute_error(y_test, yhat1)
print('MAE: %.3f' % error)
# evaluate accuracy for classification model
yhat2 = argmax(yhat2, axis=-1).astype('int')
acc = accuracy_score(y_test_class, yhat2)
print('Accuracy: %.3f' % acc)
```

Running the example will prepare the dataset, fit the model, and report an estimate of model error.

**Note**: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

A plot of the multi-output model is created, clearly showing the regression (left) and classification (right) output layers connected to the second hidden layer of the model.

In this case, we can see that the model achieved both a reasonable error of about 1.495 (rings) and a similar accuracy as before of about 25.6%.

```
...
Epoch 145/150
88/88 - 0s - loss: 6.5707 - dense_2_loss: 4.5396 - dense_3_loss: 2.0311
Epoch 146/150
88/88 - 0s - loss: 6.5753 - dense_2_loss: 4.5466 - dense_3_loss: 2.0287
Epoch 147/150
88/88 - 0s - loss: 6.5970 - dense_2_loss: 4.5723 - dense_3_loss: 2.0247
Epoch 148/150
88/88 - 0s - loss: 6.5640 - dense_2_loss: 4.5389 - dense_3_loss: 2.0251
Epoch 149/150
88/88 - 0s - loss: 6.6053 - dense_2_loss: 4.5827 - dense_3_loss: 2.0226
Epoch 150/150
88/88 - 0s - loss: 6.5754 - dense_2_loss: 4.5524 - dense_3_loss: 2.0230
MAE: 1.495
Accuracy: 0.256
```
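The log reports three values per epoch: an overall `loss` and one loss per output layer. The overall value appears to be the plain sum of the two output losses, which can be confirmed with a quick check on the figures copied from the log above. In the sketch, `dense_2_loss` is taken to be the regression (MSE) loss and `dense_3_loss` the classification (cross-entropy) loss, following the order in which the outputs were defined.

```python
# (epoch, total loss, regression loss, classification loss) copied from the log
log = [
    (145, 6.5707, 4.5396, 2.0311),
    (146, 6.5753, 4.5466, 2.0287),
    (147, 6.5970, 4.5723, 2.0247),
    (148, 6.5640, 4.5389, 2.0251),
    (149, 6.6053, 4.5827, 2.0226),
    (150, 6.5754, 4.5524, 2.0230),
]

mismatches = []
for epoch, total, reg_loss, clas_loss in log:
    combined = reg_loss + clas_loss
    share = 100.0 * reg_loss / total
    print('Epoch %d: %.4f + %.4f = %.4f (regression share %.0f%%)'
          % (epoch, reg_loss, clas_loss, combined, share))
    # record any epoch where the sum does not match the reported total
    if abs(combined - total) > 1e-6:
        mismatches.append(epoch)

print('Mismatched epochs:', mismatches)
```

The regression term makes up around 70% of the total here, so it has more influence on the weight updates than the classification term. If a different balance is wanted, the Keras `compile()` function accepts a `loss_weights` argument for weighting each output's loss, although that is not explored in this tutorial.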

## Summary

In this tutorial, you discovered how to develop a neural network for combined regression and classification predictions.

Specifically, you learned:

- Some prediction problems require predicting both numeric and class label values for each input example.
- How to develop separate regression and classification models for problems that require multiple outputs.
- How to develop and evaluate a neural network model capable of making simultaneous regression and classification predictions.

**Do you have any questions?**

Ask your questions in the comments below and I will do my best to answer.