AdaBoostStumpClassifier MethodError: zero(::Type{Symbol})
The function fit! fails when n_iterations is greater than 5.
bdt = let
    _model = AdaBoostStumpClassifier(; n_iterations = 10)
    fit!(_model, X_train, y_train)
end
fails with the error:
MethodError: no method matching zero(::Type{Symbol})
The function `zero` exists, but no method is defined for this combination of argument types.
Closest candidates are:
zero(::Type{Union{}}, Any...)
@ Base number.jl:310
zero(::Type{Dates.DateTime})
@ Dates ~/.julia/juliaup/julia-1.11.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/Dates/src/types.jl:458
zero(::Type{Pkg.Resolve.VersionWeight})
@ Pkg ~/.julia/juliaup/julia-1.11.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/Pkg/src/Resolve/versionweights.jl:15
...
Whether it fails depends on the training dataset. In the MWE below, the fit works with one labeling function and fails with the other.
MWE
begin
    using Random
    using DataFrames
    using DecisionTree
    Random.seed!(1234)
end
function classify_signal_background(x, y)
    # Sinusoidal boundary
    # if sin(2.5π * (x - 0.55)) / 5 + 0.3 + 0.4x < y < 0.7 + 0.4x # note: this one has no problem
    if (x - 0.25)^2 + (y - 0.25)^2 < 0.05 || (x - 0.65)^2 + (y - 0.65)^2 < 0.05
        return :signal
    else
        return :background
    end
end
const features = [:f1, :f2];
df = let
    _df = DataFrame(rand(500, 2), features)
    transform!(_df, features => ByRow(classify_signal_background) => :y)
end
bdt = let
    _model = AdaBoostStumpClassifier(; n_iterations = 40)
    X_train = df[:, features] |> Matrix
    y_train = df[:, :y]
    fit!(_model, X_train, y_train)
end
@mmikhasenko Thanks for reporting this and providing an MWE, which I have been able to reproduce. I am not a regular maintainer, but I agree that the documentation suggests labels can be arbitrarily encoded in classification, so this is indeed a bug. (If you use the MLJ interface, you are required to encode the target y_train as a CategoricalVector, and internally the target is integer-encoded, so I would not expect to see this error there.)
I have not diagnosed the precise issue, but the following workaround appears to work: recode the labels as integers.
y_train = map(y_train) do y
    y == :signal ? 1 : 0
end
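If you need to map predictions back to the original labels, a reversible encoding may be more convenient than a hard-coded 1/0. This is a minimal sketch in plain Julia. The names labels, classes, encode and y_int are illustrative and not part of DecisionTree.jl, and the short labels vector stands in for df[:, :y] from the MWE.

```julia
# Stand-in for the Symbol labels in df[:, :y]; illustrative data only.
labels = [:signal, :background, :signal]

# Sorted list of distinct classes; a class's position is its integer code.
classes = sort(unique(labels))

# Lookup table from label to integer code (1-based).
encode = Dict(c => i for (i, c) in enumerate(classes))

# Integer-encoded target, usable in place of the Symbol vector.
y_int = [encode[l] for l in labels]

# Decoding is plain indexing, so integer predictions map back to Symbols.
decoded = classes[y_int]
@assert decoded == labels
```

The integer vector would then be passed to fit! in place of y_train, and the output of predict indexed into classes to recover the Symbol labels.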
Curiously, you can alternatively recode the labels as strings, and no analogous zero(::Type{String}) error is thrown.
y_train = string.(y_train)
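For what it is worth, Base does not appear to define zero for String any more than it does for Symbol, so the string workaround succeeding suggests that string labels take a code path that never calls zero on the label type. I have not confirmed this in the DecisionTree.jl source. The check below is a small sketch; zero_is_defined is a hypothetical helper written for this comment, not a library function.

```julia
# Hypothetical helper: returns true if zero(T) has a method, false if calling
# it raises a MethodError. Any other error is rethrown.
function zero_is_defined(T::Type)
    try
        zero(T)
        return true
    catch err
        err isa MethodError || rethrow()
        return false
    end
end

@show zero_is_defined(Int)     # numeric types have an additive identity
@show zero_is_defined(Symbol)  # the case from the error message above
@show zero_is_defined(String)  # string labels, which do not trigger the bug
```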
If I have some more time, I may take a deeper look at this. In the meantime, perhaps another maintainer can take a look.
@mmikhasenko If you do diagnose this yourself, I can promise a timely review of any PR.