MLJ.jl
models(matching(X, y)) for Images
Currently: models(matching(X, y)) doesn't return relevant models when X has images
```julia
import Flux

X, y = Flux.Data.MNIST.images(), Flux.Data.MNIST.labels()
typeof(X), typeof(y)
models(matching(X, y))
```
I'm not sure whether this is outside the scope of models(matching(X, y)). In principle, it could also return all time-series models, etc.:
```julia
models(matching(X, y), x -> x.TS == true)
```
My favorite demonstration of multiple Julia classifiers (XGBoost.jl, Flux, NaiveBayes.jl, etc.) on image data is @oxinabox's https://white.ucc.asn.au/2017/12/18/7-Binary-Classifier-Libraries-in-Julia.html
In MLJ you can't use integers to encode categorical data:

```julia
julia> scitype(y)
AbstractArray{Count,1}
```
Fix:

```julia
y = coerce(y, Multiclass);
```
Now you can find the MLJFlux model:
```julia
julia> models(matching(X, y))
1-element Array{NamedTuple{(:name, :package_name, :is_supervised, :docstring, :hyperparameter_ranges, :hyperparameter_types, :hyperparameters, :implemented_methods, :is_pure_julia, :is_wrapper, :load_path, :package_license, :package_url, :package_uuid, :prediction_type, :supports_online, :supports_weights, :input_scitype, :target_scitype, :output_scitype),T} where T<:Tuple,1}:
 (name = ImageClassifier, package_name = MLJFlux, ... )
```
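Once the model shows up in the search, it can be loaded and fit in the usual MLJ way. A minimal sketch, assuming MLJFlux is in the environment and using X, y as prepared above (the `epochs = 5` setting is just an illustrative choice for a quick run, not a recommendation):

```julia
using MLJ

# Load the model type from the MLJFlux package
ImageClassifier = @load ImageClassifier pkg=MLJFlux

clf  = ImageClassifier(epochs = 5)   # small epoch count for a quick sanity check
mach = machine(clf, X, y)            # bind model and data
fit!(mach)
yhat = predict(mach, X)              # probabilistic predictions over the 10 classes
```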
You don't find the tree boosters because their current MLJ implementations expect tabular data.
```julia
julia> info("XGBoostClassifier").input_scitype
Table{_s23} where _s23<:(AbstractArray{_s25,1} where _s25<:Continuous)
```
while
```julia
julia> scitype(X)
AbstractArray{GrayImage{28,28},1}
```
You can still use them, but (currently) you need to pre-process the data into tabular form. It might be useful to have a transformer to do this kind of thing, but I have not looked into it.
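One way to do that by hand — a sketch, assuming each element of X is a 28×28 grayscale image as above — is to flatten every image into a row of a matrix and wrap the result as a table:

```julia
using MLJ

# Flatten each 28×28 image into a 784-element Float64 vector,
# stack the vectors as rows of a matrix, and wrap it as a table,
# giving the Table{...Continuous} scitype that XGBoostClassifier expects.
Xmat = reduce(vcat, [permutedims(vec(Float64.(img))) for img in X])
Xtab = MLJ.table(Xmat)

scitype(Xtab)
```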