More sophisticated example of learning networks
@ablaom it would be nice to have a quick example of something that can't be done with a pipeline. I was thinking of something like:

- Layer 1: a standardizer and a Box-Cox transformer
- Layer 2: a ridge regressor and a decision tree regressor
- Layer 3: `hcat` the two predictions and feed them to a `LinearRegressor`
It should look something like this:

```julia
W = X |> Standardizer()
z = y |> UnivariateBoxCoxTransformer()
ẑ₁ = (W, z) |> RidgeRegressor()
ẑ₂ = (W, z) |> DecisionTreeRegressor()
R = hcat(ẑ₁, ẑ₂)
ẑ = (R, z) |> LinearRegressor()
ŷ = ẑ |> inverse_transform(z)
```
but it looks like there's an issue with fitting the `R` node; could you comment on it?

```
ERROR: MethodError: no method matching ridge(::Array{Any,2}, ::Array{Float64,1}, ::Float64)
```
Hmm, the KNN-Ridge blend example is actually just that, but I'd still be curious to understand what I did wrong above.
- Your problem: `LinearRegressor()` expects a table but it is getting a matrix here, so you need to insert an `MLJ.table` call.
- Another problem: You can't apply the `inverse_transform` directly to `ẑ` because it makes probabilistic predictions, so you need to insert a `mean`. Sadly, `mean` is not overloaded for nodes, so you need to use the `node` method to do this yourself.
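For context, `node(f, args...)` wraps an ordinary function `f` so that it is applied lazily to the values of other nodes. A tiny illustrative sketch, with made-up nodes `n1` and `n2`:

```julia
# illustrative sketch: N is itself a node; calling it applies the anonymous
# function to the current values n1() and n2()
N = node((a, b) -> a .+ b, n1, n2)
N()   # same result as n1() .+ n2()
```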
The following code works:
```julia
using MLJ

Xr, yr = @load_boston          # Boston housing features and target
zr = rand(length(yr))

# wrap the data in source nodes (z is rebound to the Box-Cox node below)
X = source(Xr)
y = source(yr)
z = source(zr)

# layer 1: standardize the features and Box-Cox transform the target
W = X |> Standardizer()
z = y |> UnivariateBoxCoxTransformer()

# layer 2: two regressors trained on the transformed data
ẑ₁ = (W, z) |> @load RidgeRegressor pkg=MultivariateStats
ẑ₂ = (W, z) |> @load DecisionTreeRegressor

# layer 3: concatenate the predictions and wrap them as a table for GLM
R = MLJ.table(hcat(ẑ₁, ẑ₂))
ẑ_prob = (R, z) |> @load LinearRegressor pkg=GLM

# GLM's LinearRegressor predicts distributions, so take their means
# before inverting the Box-Cox transform
ẑ = node(v -> mean.(v), ẑ_prob)
ŷ = ẑ |> inverse_transform(z)

fit!(ŷ)
ŷ()
```
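As a usage note (the row indices below are illustrative, not from the original post), the network can also be trained and queried on row subsets, which is how you would hold out a test set here:

```julia
# train on one subset of rows and predict on another (indices are illustrative)
train, test = 1:400, 401:length(yr)
fit!(ŷ, rows=train)
ŷ(rows=test)
```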
Side comment: A great non-trivial example is stacking, as suggested by the diagram in the MLJ README.
Or you could have used `predict_mean`, but then you couldn't use the arrow syntax, I guess.
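For completeness, a rough, untested sketch of that `predict_mean` variant for the final layer, reusing the `R` and `z` nodes from the code above; the machine has to be constructed explicitly, which is why the arrow syntax no longer applies to that step:

```julia
# sketch: replace the node(v -> mean.(v), ẑ_prob) step with predict_mean
lin_model = @load LinearRegressor pkg=GLM
lin = machine(lin_model, R, z)
ẑ = predict_mean(lin, R)        # node returning the means of the predicted distributions
ŷ = ẑ |> inverse_transform(z)   # the inverse-transform step is unchanged
```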