DecisionTree.jl

Calculation of `new_coeff` in adaboost

Open • paulstey opened this issue 8 years ago • 0 comments

I'm curious about the calculation of `new_coeff` in `build_adaboost_stumps()`.

Just about every book and article I've found gives the formula as `new_coeff = 0.5 * log((1 - err)/err)`. Why does the function instead use `new_coeff = 0.5 * log((1 + err)/(1 - err))`? I suspect this subtle difference could have a real impact on accuracy.
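To make the difference concrete, here is a minimal sketch that evaluates both expressions over a few error rates. The helper names `coeff_textbook` and `coeff_current` are hypothetical and are not taken from the package; only the two formulas come from the text above.

```julia
# Hypothetical helper names; only the two formulas come from the discussion above.
coeff_textbook(err) = 0.5 * log((1 - err) / err)        # form given in most references
coeff_current(err)  = 0.5 * log((1 + err) / (1 - err))  # form described in this issue

# Print both coefficients side by side for a few weighted error rates.
for err in (0.1, 0.3, 0.5, 0.7)
    println("err = ", err,
            "  textbook = ", round(coeff_textbook(err), digits=3),
            "  current = ", round(coeff_current(err), digits=3))
end
```

The textbook form is positive for `err < 0.5`, zero at `err = 0.5`, and negative above that, so a learner no better than chance gets no vote. The other form is positive and increasing for every `err` in (0, 1), so it gives larger coefficients to learners with higher error.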

I would also note that in the book *Boosting* by Schapire and Freund, the authors point out that in each boosting round `err` ought to be approximately 0.5. The current method does not behave that way: `err` tends towards 1.0 and then becomes `NaN` in all subsequent rounds.
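As a sanity check on the 0.5 behaviour, here is a small sketch of one standard AdaBoost reweighting step on made-up data. It is not the package's code: the toy labels, the helpers `weighted_err` and `reweight`, and the exponential update rule are my assumptions, based on the usual textbook presentation.

```julia
# Toy data: labels and stump predictions in {-1, +1}; all values are made up.
y    = [1, 1, -1, -1, 1]
pred = [1, -1, -1, -1, 1]                 # the stump misclassifies sample 2 only
w    = fill(1 / length(y), length(y))     # uniform starting weights

# Weighted fraction of misclassified samples (hypothetical helper).
weighted_err(w, y, pred) = sum(w[y .!= pred]) / sum(w)

# Textbook exponential update followed by normalization (hypothetical helper).
function reweight(w, y, pred, coeff)
    w_new = w .* exp.(-coeff .* y .* pred)
    return w_new ./ sum(w_new)
end

err   = weighted_err(w, y, pred)
alpha = 0.5 * log((1 - err) / err)        # textbook coefficient
w2    = reweight(w, y, pred, alpha)

# Error of the same stump under the updated weights: approximately 0.5.
println(weighted_err(w2, y, pred))

# The other formula evaluated at err = 1.0 divides by zero inside the log.
coeff_at_one = 0.5 * log((1 + 1.0) / (1 - 1.0))
println(coeff_at_one)
```

With the textbook coefficient, the misclassified and correctly classified samples end up with equal total weight, which is where the 0.5 comes from. The last two lines show that the `(1 + err)/(1 - err)` form is infinite at `err = 1.0`; that may be one route to the `NaN` values, though I have not traced it through the package.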

Is there a good citation you can recommend for the current approach? Or is this possibly an oversight? I might be missing something.

Thanks in advance.

-Paul

P.S. Thanks for creating this package!

paulstey · Jan 26 '17 18:01