
Investigate 'boosting' implementation

jeff1evesque opened this issue 7 years ago • 2 comments

We need to investigate the requirements to streamline a boosting implementation.

jeff1evesque commented Oct 08 '17 21:10

The following are specific AdaBoost implementations:
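One readily available example is scikit-learn's `AdaBoostClassifier`; a minimal usage sketch (the dataset and parameters below are illustrative only, not a recommendation for this project):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Toy data, purely for illustration.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The default base estimator is a depth-1 decision tree (a "stump").
clf = AdaBoostClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))
```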

jeff1evesque commented Oct 09 '17 22:10

AdaBoost is a way of creating classifiers for an ensemble.

The AdaBoost method takes a base classifier and attempts to fit it to the data. When the algorithm misclassifies something (say, a point called M), AdaBoost increases the weight of the failure point M on the next iteration, effectively adding more copies of M to the training set.

So, when training the second classifier, the dataset may effectively contain twice as many copies of the point "M". This raises the chance that M is classified correctly by that classifier.
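A minimal sketch of that reweighting step (NumPy; the variable names below are made up for illustration, and the exponential update is the standard discrete-AdaBoost rule rather than anything specific to this repo):

```python
import numpy as np

# Hypothetical state after one round: per-sample weights, plus a
# boolean mask of the samples the current classifier got wrong.
sample_weights = np.full(6, 1 / 6)  # uniform to start
mispredicted = np.array([False, False, True, False, False, False])

# Weighted error of this classifier, then its vote weight alpha.
err = sample_weights[mispredicted].sum()
alpha = 0.5 * np.log((1 - err) / err)

# Boost the misclassified sample and renormalize, so point "M"
# counts for more when the next classifier is trained.
sample_weights[mispredicted] *= np.exp(alpha)
sample_weights[~mispredicted] *= np.exp(-alpha)
sample_weights /= sample_weights.sum()
print(sample_weights)  # the misclassified point now holds half the mass
```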

The algorithm does this N times to create N classifiers. These classifiers are then used in a typical ensemble classifier with a weighted majority vote to decide the outcome.

What this means is that classifier 1 has no boosting, classifier 2 has the failures of classifier 1 boosted, classifier 3 has the failures of classifier 2 boosted, and so on, until classifier N, which has the failures of classifier N-1 boosted.
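Putting those pieces together, here is a from-scratch sketch of the loop, with scikit-learn decision stumps as the base classifier (names like `fit_adaboost` are hypothetical, not this project's API; labels are assumed to be in {-1, +1}):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_adaboost(X, y, n_rounds=10):
    """Train n_rounds stumps, each on the reweighted failures of the last."""
    n = len(y)
    w = np.full(n, 1 / n)
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)       # train on the boosted weights
        pred = stump.predict(X)
        err = w[pred != y].sum()
        if err == 0 or err >= 0.5:             # perfect, or no better than chance
            break
        alpha = 0.5 * np.log((1 - err) / err)  # this stump's vote weight
        w *= np.exp(-alpha * y * pred)         # upweight the failures
        w /= w.sum()
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def predict_adaboost(stumps, alphas, X):
    """Weighted majority vote across the N classifiers."""
    votes = sum(a * s.predict(X) for s, a in zip(stumps, alphas))
    return np.sign(votes)
```

The `-alpha * y * pred` exponent is positive exactly when a prediction disagrees with its label, which is what produces the "failures boosted" behavior described above.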

The algorithm adapts to its failures, which is good in some cases and detrimental in others. For example, in datasets with major outliers, the algorithm will adapt heavily to those outliers, which may skew its classification of normal data points.
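As a rough illustration of that outlier sensitivity, assume one point that every round's classifier gets wrong, and a fixed weighted error of 0.2 per round; its weight compounds quickly:

```python
import numpy as np

n, outlier, rounds = 100, 0, 5
w = np.full(n, 1 / n)
for _ in range(rounds):
    err = 0.2                              # assumed weighted error each round
    alpha = 0.5 * np.log((1 - err) / err)  # = ln 2, so exp(alpha) = 2
    w[outlier] *= np.exp(alpha)            # always misclassified: upweighted
    w[np.arange(n) != outlier] *= np.exp(-alpha)
    w /= w.sum()

print(w[outlier])  # roughly 0.91: one point dominates the training distribution
```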

protojas commented Oct 13 '17 20:10