DecisionTree.jl
Calculation of `new_coeff` in adaboost
I'm curious about the calculation of `new_coeff` in `build_adaboost_stumps()`.
Just about every book and article I've found lists the formula as `new_coeff = 0.5 * log((1 - err)/err)`. I was curious why the function instead uses `new_coeff = 0.5 * log((1 + err)/(1 - err))`. I think this subtle difference might actually have quite an impact on accuracy.
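To make the difference concrete, here is a minimal sketch that evaluates both expressions quoted above over a few error values. It is written in Python rather than Julia and is not the package's code; the function names `standard_coeff` and `current_coeff` are hypothetical labels of mine, and `err` is assumed to be a weighted error rate strictly between 0 and 1.

```python
import math

def standard_coeff(err):
    # Textbook AdaBoost weight: 0.5 * log((1 - err) / err)
    return 0.5 * math.log((1.0 - err) / err)

def current_coeff(err):
    # Expression quoted above from build_adaboost_stumps()
    return 0.5 * math.log((1.0 + err) / (1.0 - err))

# Compare the two weights for a range of error rates in (0, 1).
for err in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"err={err:.1f}  "
          f"standard={standard_coeff(err):+.3f}  "
          f"current={current_coeff(err):+.3f}")
```

In this sketch the textbook weight is positive for `err < 0.5`, zero at `err = 0.5`, and negative above it, whereas the other expression is positive everywhere on (0, 1) and grows without bound as `err` approaches 1. The two expressions would agree if the argument were `1 - 2*err` rather than `err` itself; I don't know whether that was the intent.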
I would also note that in the book Boosting by Schapire and Freund, they point out that for each boosting round `err` ought to be approximately 0.5. The current method does not behave that way; instead `err` tends towards 1.0 and then becomes `NaN` for all rounds afterwards.
Is there a good citation you can recommend for the current approach? Or, is this a possible oversight? I might be missing something.
Thanks in advance.
-Paul
P.S. Thanks for creating this package!