Last Updated on April 27, 2021

Dynamic ensemble selection is an ensemble learning technique that automatically selects a subset of ensemble members just-in-time when making a prediction.

The technique involves fitting multiple machine learning models on the training dataset, then selecting the models that are expected to perform best when making a prediction for a specific new example, based on the details of the example to be predicted.

This can be achieved using a k-nearest neighbor model to locate examples in the training dataset that are closest to the new example to be predicted, evaluating all models in the pool on this neighborhood, and using the models that perform best on the neighborhood to make a prediction for the new example.

As such, dynamic ensemble selection can often perform better than any single model in the pool and better than averaging all members of the pool, so-called static ensemble selection.

In this tutorial, you will discover how to develop dynamic ensemble selection models in Python.

After completing this tutorial, you will know:

  • Dynamic ensemble selection algorithms automatically choose ensemble members when making a prediction on new data.
  • How to develop and evaluate dynamic ensemble selection models for classification tasks using the scikit-learn API.
  • How to explore the effect of dynamic ensemble selection model hyperparameters on classification accuracy.

Kick-start your project with my new book Ensemble Learning Algorithms With Python, including step-by-step tutorials and the Python source code files for all examples.

Let’s get started.

Dynamic Ensemble Selection (DES) in Python

Photo by Simon Harrod, some rights reserved.

Tutorial Overview

This tutorial is divided into three parts; they are:

  1. Dynamic Ensemble Selection
  2. k-Nearest Neighbor Oracle (KNORA) With Scikit-Learn
    1. KNORA-Eliminate (KNORA-E)
    2. KNORA-Union (KNORA-U)
  3. Hyperparameter Tuning for KNORA
    1. Explore k in k-Nearest Neighbors
    2. Explore Algorithms for Classifier Pool

Dynamic Ensemble Selection

Multiple Classifier Systems refers to a field of machine learning algorithms that use multiple models to address classification predictive modeling problems.

The first class of multiple classifier systems to find success is referred to as Dynamic Classifier Selection, or DCS for short.

  • Dynamic Classifier Selection: Algorithms that dynamically choose one from among many trained models to make a prediction based on the specific details of the input.

Dynamic Classifier Selection algorithms often involve partitioning the input feature space in some way and assigning specific models to be responsible for making predictions for each partition. There are a variety of different DCS algorithms, and research efforts are mainly focused on how to evaluate and assign classifiers to specific regions of the input space.

After training multiple individual learners, DCS dynamically selects one learner for each test instance. […] DCS makes predictions by using one individual learner.

— Page 93, Ensemble Methods: Foundations and Algorithms, 2012.

A natural extension to DCS is algorithms that select one or more models dynamically in order to make a prediction. That is, selecting a subset or ensemble of classifiers dynamically. These techniques are referred to as dynamic ensemble selection, or DES.

  • Dynamic Ensemble Selection: Algorithms that dynamically choose a subset of trained models to make a prediction based on the specific details of the input.

Dynamic Ensemble Selection algorithms operate much like DCS algorithms, except predictions are made using votes from multiple classifier models instead of a single best model. In effect, each region of the input feature space is owned by a subset of models that perform best in that region.

… given the fact that selecting only one classifier can be highly error-prone, some researchers decided to select a subset of the pool of classifiers rather than just a single base classifier. All base classifiers that obtained a certain competence level are used to compose the EoC, and their outputs are aggregated to predict the label …

— Dynamic Classifier Selection: Recent Advances And Perspectives, 2018.

Perhaps the canonical approach to dynamic ensemble selection is the k-Nearest Neighbor Oracle, or KNORA, algorithm as it is a natural extension of the canonical dynamic classifier selection algorithm “Dynamic Classifier Selection Local Accuracy,” or DCS-LA.

DCS-LA involves selecting the k-nearest neighbors from the training or validation dataset for a given new input pattern, then selecting the single best classifier based on its performance in that neighborhood of k examples to make a prediction on the new example.

KNORA was described by Albert Ko, et al. in their 2008 paper titled “From Dynamic Classifier Selection To Dynamic Ensemble Selection.” It is an extension of DCS-LA that selects multiple models that perform well on the neighborhood and whose predictions are then combined using majority voting to make a final output prediction.

For any test data point, KNORA simply finds its nearest K neighbors in the validation set, figures out which classifiers correctly classify those neighbors in the validation set and uses them as the ensemble for classifying the given pattern in that test set.

— From Dynamic Classifier Selection To Dynamic Ensemble Selection, 2008.

The selected classifier models are referred to as “oracles,” hence the use of oracle in the name of the method.

The ensemble is considered dynamic because the members are chosen just-in-time, conditioned on the specific input pattern requiring a prediction. This is opposed to static, where ensemble members are chosen once, such as averaging predictions from all classifiers in the model.

This is done through a dynamic fashion, since different patterns might require different ensembles of classifiers. Thus, we call our method a dynamic ensemble selection.

— From Dynamic Classifier Selection To Dynamic Ensemble Selection, 2008.

Two versions of KNORA are described, including KNORA-Eliminate and KNORA-Union.

  • KNORA-Eliminate (KNORA-E): Ensemble of classifiers that achieves perfect accuracy on the neighborhood of the new example, reducing the neighborhood size until at least one perfect classifier is located.
  • KNORA-Union (KNORA-U): Ensemble of all classifiers that make at least one correct prediction on the neighborhood, with weighted voting and votes proportional to accuracy on the neighborhood.

KNORA-Eliminate, or KNORA-E for short, involves selecting all classifiers that achieve perfect predictions on the neighborhood of k examples. If no classifier achieves 100 percent accuracy, the neighborhood size is reduced by one and the models are re-evaluated. This process is repeated until one or more models with perfect performance are discovered, which are then used to make a prediction for the new example.

In the case where no classifier can correctly classify all the K-nearest neighbors of the test pattern, then we simply decrease the value of K until at least one classifier correctly classifies its neighbors

— From Dynamic Classifier Selection To Dynamic Ensemble Selection, 2008.

KNORA-Union, or KNORA-U for short, involves selecting all classifiers that make at least one correct prediction in the neighborhood. The predictions from each classifier are then combined using a weighted average, where the number of correct predictions in the neighborhood indicates the number of votes assigned to each classifier.

The more neighbors a classifier classifies correctly, the more votes this classifier will have for a test pattern

— From Dynamic Classifier Selection To Dynamic Ensemble Selection, 2008.

Now that we are familiar with DES and the KNORA algorithm, let’s look at how we can use it on our own classification predictive modeling projects.

Want to Get Started With Ensemble Learning?

Take my free 7-day email crash course now (with sample code).

Click to sign-up and also get a free PDF Ebook version of the course.

Download Your FREE Mini-Course

k-Nearest Neighbor Oracle (KNORA) With Scikit-Learn

The Dynamic Ensemble Library, or DESlib for short, is a Python machine learning library that provides an implementation of many different dynamic classifier and dynamic ensemble selection algorithms.

DESlib is an easy-to-use ensemble learning library focused on the implementation of the state-of-the-art techniques for dynamic classifier and ensemble selection.

First, we can install the DESlib library using the pip package manager, if it is not already installed.
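The typical install command is shown below (depending on your environment, you may need pip3 or a virtual environment):

```shell
pip install deslib
```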

Once installed, we can then check that the library was installed correctly and is ready to be used by loading the library and printing the installed version.

Running the script will print the version of the DESlib library you have installed.

Your version should be the same or higher. If not, you must upgrade your version of the DESlib library.

DESlib provides an implementation of the KNORA algorithm with both dynamic ensemble selection techniques via the KNORAE and KNORAU classes respectively.

Each class can be used as a scikit-learn model directly, allowing the full suite of scikit-learn data preparation, modeling pipelines, and model evaluation techniques to be used.

Both classes use a k-nearest neighbor algorithm to select the neighborhood, with a default value of k=7.

A bootstrap aggregation (bagging) ensemble of decision trees is used as the pool of classifier models considered for each classification by default, although this can be changed by setting “pool_classifiers” to a list of models.

We can use the make_classification() function to create a synthetic binary nomenclature problem with 10,000 examples and 20 input features.
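A sketch of creating this dataset; the split of informative versus redundant features and the random seed are illustrative choices, not requirements of the technique:

```python
# create a synthetic binary classification dataset
from sklearn.datasets import make_classification

# 10,000 examples with 20 input features; the informative/redundant
# split and random_state are illustrative choices
X, y = make_classification(n_samples=10000, n_features=20, n_informative=15,
                           n_redundant=5, random_state=7)
# summarize the shape of the input and output elements
print(X.shape, y.shape)
```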

Running the example creates the dataset and summarizes the shape of the input and output components.

Now that we are familiar with the DESlib API, let’s look at how to use each KNORA algorithm on our synthetic classification dataset.

KNORA-Eliminate (KNORA-E)

We can evaluate a KNORA-Eliminate dynamic ensemble selection algorithm on the synthetic dataset.

In this case, we will use default model hyperparameters, including bagged decision trees as the pool of classifier models and k=7 for the selection of the local neighborhood when making a prediction.

We will evaluate the model using repeated stratified k-fold cross-validation with three repeats and 10 folds. We will report the mean and standard deviation of the accuracy of the model across all repeats and folds.

The complete example is listed below.

Running the example reports the mean and standard deviation accuracy of the model.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

In this case, we can see the KNORA-E ensemble with default hyperparameters achieves a classification accuracy of about 91.5 percent.

We can also use the KNORA-E ensemble as a final model and make predictions for classification.

First, the model is fit on all available data, then the predict() function can be called to make predictions on new data.

The example below demonstrates this on our binary classification dataset.

Running the example fits the KNORA-E dynamic ensemble selection algorithm on the entire dataset, which is then used to make a prediction on a new row of data, as we might when using the model in an application.

Now that we are familiar with using KNORA-E, let’s look at the KNORA-Union method.

KNORA-Union (KNORA-U)

We can evaluate a KNORA-Union model on the synthetic dataset.

In this case, we will use default model hyperparameters, including bagged decision trees as the pool of classifier models and k=7 for the selection of the local neighborhood when making a prediction.

We will evaluate the model using repeated stratified k-fold cross-validation with three repeats and 10 folds. We will report the mean and standard deviation of the accuracy of the model across all repeats and folds.

The complete example is listed below.

Running the example reports the mean and standard deviation accuracy of the model.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

In this case, we can see the KNORA-U dynamic ensemble selection model with default hyperparameters achieves a classification accuracy of about 93.3 percent.

We can also use the KNORA-U model as a final model and make predictions for classification.

First, the model is fit on all available data, then the predict() function can be called to make predictions on new data.

The example below demonstrates this on our binary classification dataset.

Running the example fits the KNORA-U model on the entire dataset, which is then used to make a prediction on a new row of data, as we might when using the model in an application.

Now that we are familiar with using the scikit-learn API to evaluate and use KNORA models, let’s look at configuring the model.

Hyperparameter Tuning for KNORA

In this section, we will take a closer look at some of the hyperparameters you should consider tuning for the KNORA model and their effect on model performance.

There are many hyperparameters we could look at for KNORA, although in this case, we will look at the value of k in the k-nearest neighbor model used in the local evaluation of the models, and how to use a custom pool of classifiers.

We will use KNORA-Union as the basis for these experiments, although the choice of the specific method is arbitrary.

Explore k in k-Nearest Neighbors

The configuration of the k-nearest neighbors algorithm is critical to the KNORA model as it defines the scope of the neighborhood in which classifiers are considered for selection.

The k value controls the size of the neighborhood, and it is important to set it to a value that is appropriate for your dataset, specifically the density of samples in the feature space. A value too small will mean that relevant examples in the training set might be excluded from the neighborhood, whereas values too large may mean that the signal is being washed out by too many examples.

The code example below explores the classification accuracy of the KNORA-U algorithm with k values from 2 to 21.

Running the example first reports the mean accuracy for each configured neighborhood size.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

In this case, we can see that accuracy increases with the neighborhood size, perhaps up to k=10, where it appears to level off.

A box and whisker plot is created for the distribution of accuracy scores for each configured neighborhood size.

We can see the general trend of increasing model performance with k value before reaching a plateau.

Box and Whisker Plots of Accuracy Distributions for k Values in KNORA-U

Explore Algorithms for Classifier Pool

The choice of algorithms used in the pool for KNORA is another important hyperparameter.

By default, bagged decision trees are used, as this has proven to be an effective approach on a range of classification tasks. Nevertheless, a custom pool of classifiers can be considered.

In the majority of DS publications, the pool of classifiers is generated using either well known ensemble generation methods such as Bagging, or by using heterogeneous classifiers.

— Dynamic Classifier Selection: Recent Advances And Perspectives, 2018.

This requires first defining a list of classifier models to use and fitting each on the training dataset. Unfortunately, this means that the automatic k-fold cross-validation model evaluation methods in scikit-learn cannot be used in this case. Instead, we will use a train-test split so that we can fit the classifier pool manually on the training dataset.

The list of fit classifiers can then be specified to the KNORA-Union (or KNORA-Eliminate) class via the “pool_classifiers” argument. In this case, we will use a pool that includes logistic regression, a decision tree, and a naive Bayes classifier.

The complete example of evaluating the KNORA ensemble with a custom set of classifiers on the synthetic dataset is listed below.

Running the example first reports the mean accuracy for the model with the custom pool of classifiers.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

In this case, we can see that the model achieved an accuracy of about 91.3 percent.

In order to adopt the KNORA model, it must perform better than any contributing model. Otherwise, we would simply use the contributing model that performs better.

We can check this by evaluating the performance of each contributing classifier on the test set.

The updated example of KNORA with a custom pool of classifiers that are also evaluated independently is listed below.

Running the example first reports the mean accuracy for the model with the custom pool of classifiers and the accuracy of each contributing model.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

In this case, we can see that again the KNORAU ensemble achieves an accuracy of about 91.3 percent, which is better than any contributing model.

Instead of specifying a pool of classifiers, it is also possible to specify a single ensemble algorithm from the scikit-learn library, and the KNORA algorithm will automatically use the internal ensemble members as classifiers.

For example, we can use a random forest ensemble with 1,000 members as the base classifiers to consider within KNORA.

Tying this together, the complete example of KNORA-U with random forest ensemble members as classifiers is listed below.

Running the example first reports the mean accuracy for the KNORA-U model with the random forest pool and the accuracy of the random forest model.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

In this case, we can see that the KNORA model with dynamically selected ensemble members outperforms the random forest with statically selected (full set) ensemble members.

Further Reading

This section provides more resources on the topic if you are looking to go deeper.

Related Tutorials

Papers

Books

APIs

Summary

In this tutorial, you discovered how to develop dynamic ensemble selection models in Python.

Specifically, you learned:

  • Dynamic ensemble selection algorithms automatically choose ensemble members when making a prediction on new data.
  • How to develop and evaluate dynamic ensemble selection models for classification tasks using the scikit-learn API.
  • How to explore the effect of dynamic ensemble selection model hyperparameters on classification accuracy.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

Get a Handle on Modern Ensemble Learning!

Ensemble Learning Algorithms With Python

Improve Your Predictions in Minutes

...with just a few lines of python code

Discover how in my new Ebook:
Ensemble Learning Algorithms With Python

It provides self-study tutorials with full working code on:
Stacking, Voting, Boosting, Bagging, Blending, Super Learner, and much more...

Bring Modern Ensemble Learning Techniques to
Your Machine Learning Projects


See What's Inside