6 secrets of building better models, part one: bootstrap aggregation

Many analysts who build predictive models invest most of their time and effort in tuning the parameters of one specific technique, whether that is logistic regression or a neural network, in pursuit of the most accurate model. In this series of videos we look at some often overlooked approaches that can be applied in the same way to a wide variety of algorithms and that may lead to better predictive accuracy. In all of our examples we focus on improving the accuracy of a predictive model applied to a classification problem.

Bootstrap aggregation or bagging

Bootstrap aggregation, also called bagging, is an ensemble method designed to increase the stability and accuracy of models. It involves creating a series of models from the same training data set by randomly sampling the data with replacement. Sampling with replacement means that a specific row of data may appear more than once in the resulting random sample, while other rows may not appear at all, so each model is trained on a slightly different sample of the data. The predictions from the multiple models are then combined into a single score, for example by majority vote in a classification problem.
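To make the resample-and-combine idea concrete, here is a minimal sketch in plain Python. The decision-stump base learner and the toy one-dimensional data set are invented for illustration; in practice you would bag a stronger learner, such as a full decision tree, and use a library implementation.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    # Sample with replacement: a row may appear more than once,
    # and some rows may not appear at all.
    return [rng.choice(rows) for _ in rows]

def train_stump(sample):
    # Toy base learner: a one-feature decision stump. It scans the
    # sampled x values as candidate thresholds and keeps the best one.
    best = None
    for thr, _ in sample:
        correct = sum((x >= thr) == bool(y) for x, y in sample)
        acc = max(correct, len(sample) - correct) / len(sample)
        flip = correct < len(sample) - correct
        if best is None or acc > best[0]:
            best = (acc, thr, flip)
    _, thr, flip = best
    return lambda x: int((x >= thr) != flip)

def bag(rows, n_models=25, seed=0):
    # Train one model per bootstrap sample of the same training data.
    rng = random.Random(seed)
    models = [train_stump(bootstrap_sample(rows, rng)) for _ in range(n_models)]
    def predict(x):
        # Combine the models' predictions into a single score
        # by majority vote.
        votes = Counter(m(x) for m in models)
        return votes.most_common(1)[0][0]
    return predict

# Toy training data: (feature, class) pairs, class 1 for larger values.
data = [(1.0, 0), (2.0, 0), (3.0, 0), (4.5, 0),
        (5.5, 1), (6.0, 1), (7.0, 1), (8.0, 1)]
model = bag(data)
```

Because every model sees a slightly different sample, their individual errors tend to differ, and the majority vote averages those errors away; this is why bagging stabilizes high-variance learners.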

Watch this video to find out more

Check out the other videos in this series