Introduction

Ensemble learning algorithms are those techniques that combine the predictions of two or more machine learning algorithms with the goal of improving predictive skill. Ensemble learning algorithms are a more advanced subfield of machine learning, often turned to on machine learning projects when predictive performance is the most important objective. As such, ensembles are widely used by top participants and winners of competitive machine learning competitions.

What is Ensemble Learning

Many decisions we make in life are based on the opinions of multiple other people. This includes choosing a book to read based on reviews, choosing a course of action based on the advice of multiple medical doctors, and determining guilt. Often, decision making by a group of individuals results in a better outcome than a decision made by any one member of the group. This is generally referred to as the wisdom of the crowd. We can achieve a similar result by combining the predictions of multiple machine learning models for regression and classification predictive modeling problems.

Applied machine learning often involves fitting and evaluating models on a dataset. Given that we cannot know which model will perform best on the dataset beforehand, this may involve a lot of trial and error until we find a model that performs well or best for our project. This is akin to making a decision using a single expert. Perhaps the best expert we can find. A complementary approach is to prepare multiple different models, then combine their predictions. This is called an ensemble machine learning model, or simply an ensemble, and the process of finding a well-performing ensemble model is referred to as ensemble learning.