DecisionTree.jl
                        Problem with adaboost
For some reason, boosting doesn't seem to work. I don't think the issue here is the same as #42. I tried the example from Elements of Statistical Learning and compared it to fastAdaboost in R:
julia> using Distributions, DecisionTree, RCall, DataFrames
julia> # Boosting example from EoSL
       X = randn(1000, 10);
julia> y = Vector{Int64}(vec(sum(abs2, X; dims=2) .> quantile(Chisq(10), 0.5)));
julia> # Use DecisionTree
       ada1 = DecisionTree.build_adaboost_stumps(y, X, 5);
julia> mean(apply_adaboost_stumps(ada1..., X) .== y)
0.579
julia> # Use fastAdaboost
       R"library(fastAdaboost)";
julia> df = DataFrame(X, :auto);
julia> df.y = y;
julia> ada2 = R"adaboost(y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10, data = $df, 5)";
julia> rcopy(R"predict($ada2, newdata = $df)$error")
0.021
Furthermore, build_adaboost_stumps is much slower than adaboost from fastAdaboost. It looks like build_adaboost_stumps might not use the same optimizations as build_tree.
Yeah, build_adaboost_stumps has always had issues. It uses a different optimization approach than build_tree, which makes it quite slow. Not sure what to do here; fixing it would require significant work.
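For context on the expected behaviour: discrete AdaBoost with threshold stumps is only a few lines of algorithm, so it's easy to sanity-check on a toy problem. Below is a minimal, self-contained pure-Python sketch (the dataset and the helper names best_stump, adaboost, and predict are illustrative, not DecisionTree.jl's API). On a 1-D interval target, where the best single stump caps out at about 70% training accuracy, a handful of boosting rounds should push accuracy close to 100% — the same qualitative jump the EoSL example above expects, and far from the reported 0.579.

```python
import math

# Toy 1-D dataset: label +1 inside the interval [0.3, 0.7], -1 outside.
# No single axis-aligned stump can separate an interval, but a boosted
# combination of stumps can.
X = [i / 100 for i in range(100)]
y = [1 if 0.3 <= x <= 0.7 else -1 for x in X]

def best_stump(X, y, w):
    """Exhaustively pick the threshold stump with minimum weighted error."""
    best = None
    for thr in X:
        for polarity in (1, -1):
            preds = [polarity if x >= thr else -polarity for x in X]
            err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
            if best is None or err < best[0]:
                best = (err, thr, polarity)
    return best

def adaboost(X, y, rounds=20):
    """Discrete AdaBoost: fit a stump, weight it, re-weight the samples."""
    n = len(X)
    w = [1 / n] * n
    ensemble = []                      # list of (alpha, threshold, polarity)
    for _ in range(rounds):
        err, thr, pol = best_stump(X, y, w)
        err = max(err, 1e-12)          # guard against log(0) on a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thr, pol))
        # Increase the weight of misclassified samples, decrease the rest.
        preds = [pol if x >= thr else -pol for x in X]
        w = [wi * math.exp(-alpha * yi * p) for wi, yi, p in zip(w, y, preds)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(a * (p if x >= t else -p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1

model = adaboost(X, y)
acc = sum(predict(model, x) == yi for x, yi in zip(X, y)) / len(X)
```

If build_adaboost_stumps barely beats a single stump on data like the chi-squared example, that points at the re-weighting or coefficient step rather than the stump search itself.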
I've been wondering if we should remove it from the package altogether.
Any thoughts, ideas, advice?
I might try to take a look at it and see if I can figure out what is going on. If not, it might be better to disable it.