MLJ.jl icon indicating copy to clipboard operation
MLJ.jl copied to clipboard

Unexpected loading behavior

Open azev77 opened this issue 4 years ago • 27 comments

I installed a fresh version of Julia 1.6 on my macbook. 1 installed the latest MLJ 1.6 then I tried to auto load all MLJ relevant packages

@inline function load_m(model_list)
    model_types = []
    @inbounds for model in model_list #models()
        T = @eval @load $(model.name) pkg=$(model.package_name) add=true # verbosity=0
        push!(model_types, T)
    end
    model_instances = map(T -> T(), model_types)
end
load_m(models()) # Load all models ONCE!

Now my MLJ version is 15.2

why am I getting downgraded?

azev77 avatar Mar 25 '21 20:03 azev77

I think I found the culprit


julia> Pkg.add("NaiveBayes")
   Resolving package versions...
   Installed GPUCompiler ─ v0.9.2
   Installed CUDA ──────── v2.6.0
    Updating `~/.julia/environments/v1.6/Project.toml`
  [f6006082] ↓ EvoTrees v0.7.0 ⇒ v0.5.3
  [add582a8] ↓ MLJ v0.16.0 ⇒ v0.15.2
  [d491faf4] ↓ MLJModels v0.14.1 ⇒ v0.13.3
  [9bbee03b] + NaiveBayes v0.5.0

azev77 avatar Mar 25 '21 20:03 azev77

Good work identifying.

I'll investigate.

ablaom avatar Mar 25 '21 21:03 ablaom

This should resolve the issue:

https://github.com/dfdx/NaiveBayes.jl/pull/51

ablaom avatar Mar 25 '21 21:03 ablaom

I can locally get all three:

(jl_QLSWNr) pkg> st
Status `/private/var/folders/4n/gvbmlhdc8xj973001s6vdyw00000gq/T/jl_QLSWNr/Project.toml`
  [f6006082] EvoTrees v0.7.0
  [add582a8] MLJ v0.16.0
  [9bbee03b] NaiveBayes v0.5.0 `../../../../../../../Users/anthony/Dropbox/Julia7/NaiveBayes`

ablaom avatar Mar 25 '21 22:03 ablaom

But do you know why installing a new pkg downgraded MLJ automatically & can MLJ make this not happen w/o consent?

azev77 avatar Mar 25 '21 22:03 azev77

Isn't this is essentially a pkg manager issue, not an MLJ issue?

If you want fix the MLJ version, then do ]pin MLJ or Pkg.pin("MLJ"). Then pkg manager will throw a conflict error if adding a new package pushes MLJ down.

ablaom avatar Mar 25 '21 22:03 ablaom

(pin is also an indespensible tool for debugging pkg conflicts)

ablaom avatar Mar 25 '21 22:03 ablaom

You're prob correct, I wanted to make sure there is no MLJ setting that is behind this.

These kinds of things can create a lot of issues for newcomers esp. Do you think it's worth adding this (auto load models) to Tests?

azev77 avatar Mar 25 '21 23:03 azev77

We essentially have a test like this: MLJModels.check_registry(). See https://github.com/alan-turing-institute/MLJModels.jl/blob/dev/src/registry/src/check_registry.jl .

Although, we only run it after registry updates, not in CI.

ablaom avatar Mar 26 '21 00:03 ablaom

These kinds of things can create a lot of issues for newcomers esp.

Agreed! Pkg management has been a huge stumbling block for me.

Maybe there is a good tutorial introducing package management to newcomers, I don't know. But I sure wish there had been one when I started out. 😢

instantatiate -p solves a lot of early pitfalls

I think the julia developers are aware of the importance of making pkg management more user friendly. And the status quo is better than the conda/pip/virualenv mess. I don't think it's easy to have a pkg management system without also having some kind of user training. I'm no expert, but I understand the real source of Dependency Hell is that fact that the package dependency problem is NP-hard which surprised me when I first encountered it. I think that's part of the problem. We intuitively feel pkg management ought to be something that can be fully automated, but that's hard to do.

ablaom avatar Mar 26 '21 01:03 ablaom

Would it be a big hassle to run it in CI?

azev77 avatar Mar 26 '21 01:03 azev77

Well, this requires pre-compiling and loading the entire stack of model-providing packages, which takes a while, as you know.

I suppose we could add it to the dev->master PR's only (but not part of runtests). Would you see that as sufficient?

That is, for such PR's to run

using MLJModels
MLJModels.check_registry()

and throw a build fail if an exception is thrown.

@DilumAluthge Is this something you could help with? On MLJModels, not MLJ.

ablaom avatar Mar 26 '21 02:03 ablaom

That would be great!

azev77 avatar Mar 26 '21 02:03 azev77

Yeah shouldn't be too hard.

DilumAluthge avatar Mar 26 '21 02:03 DilumAluthge

I'll work on it here: https://github.com/alan-turing-institute/MLJModels.jl/pull/369

DilumAluthge avatar Mar 26 '21 02:03 DilumAluthge

Just tried loading all on my work computer. Still have same problem w/ NaiveBayes

azev77 avatar Mar 26 '21 23:03 azev77

It seems we are still waiting for a patch release from NaiveBayes: https://github.com/dfdx/NaiveBayes.jl/pull/51#issuecomment-809846249

BTW @DilumAluthge has now added check_registry to the MLJModels CI and it seems to be working.

ablaom avatar Mar 30 '21 01:03 ablaom

It works now, but there are some instabilities. The first time I load all packages I get an error:

@inline function load_m(model_list)
    model_types = []
    @inbounds for model in model_list #models()
        if (
            1 == 1
            #model.package_name != "MLJNaiveBayesInterface" 
            #&& 
            #model.package_name !=  "NaiveBayes"
            )
            println(model.name, " ", model.package_name)
            try
                T = @eval @load $(model.name) pkg=$(model.package_name) add = true #add=true # verbosity=0
                push!(model_types, T)    
            catch e
                return e
            end 
        end     
    end
    model_instances = map(T -> T(), model_types)
end
models() |> load_m # Load all models ONCE!

#############
...

XGBoostClassifier XGBoost
[ Info: For silent loading, specify `verbosity=0`.
import MLJXGBoostInterface ✔
XGBoostCount XGBoost
[ Info: For silent loading, specify `verbosity=0`.
import MLJXGBoostInterface ✔
XGBoostRegressor XGBoost
[ Info: For silent loading, specify `verbosity=0`.
import MLJXGBoostInterface ✔
ERROR: MethodError: no method matching MLJScikitLearnInterface.ARDRegressor()
The applicable method may be too new: running in world age 29720, while current world is 30272.
Closest candidates are:
  MLJScikitLearnInterface.ARDRegressor(; n_iter, tol, alpha_1, alpha_2, lambda_1, lambda_2, compute_score, threshold_lambda, fit_intercept, normalize, copy_X, verbose) at none:0 (method too new to be called from this world context.)
Stacktrace:
 [1] (::var"#5#6")(T::Type{MLJScikitLearnInterface.ARDRegressor})
   @ Main .\REPL[2]:19
 [2] iterate
   @ .\generator.jl:47 [inlined]
 [3] _collect(c::Vector{Any}, itr::Base.Generator{Vector{Any}, var"#5#6"}, #unused#::Base.EltypeUnknown, isz::Base.HasShape{1})
   @ Base .\array.jl:691
 [4] collect_similar
   @ .\array.jl:606 [inlined]
 [5] map
   @ .\abstractarray.jl:2294 [inlined]
 [6] load_m
   @ .\REPL[2]:19 [inlined]
 [7] |>(x::Vector{NamedTuple{(:name, :package_name, :is_supervised, :docstring, :hyperparameter_ranges, :hyperparameter_types, :hyperparameters, :implemented_methods, :is_pure_julia, :is_wrapper, :load_path, :package_license, :package_url, :package_uuid, :prediction_type, :supports_online, :supports_weights, :input_scitype, :target_scitype, :output_scitype), T} where T<:Tuple}, f::typeof(load_m))
   @ Base .\operators.jl:858
 [8] top-level scope
   @ REPL[3]:1

julia> 

But it works when I run it a second time.

azev77 avatar Mar 30 '21 20:03 azev77

I can also get this behaviour but, in addition I get occasional complaints (first time around) like this one:

EvoTreeClassifier EvoTrees
[ Info: For silent loading, specify `verbosity=0`. 
[ Info: Precompiling EvoTrees [f6006082-12f8-11e9-0c9c-0d5d367ab1e5]
ERROR: LoadError: ArgumentError: Package CUDA [052768ef-5323-5732-b1bb-66c8b64840ba] is required but does not seem to be installed:
 - Run `Pkg.instantiate()` to install all recorded dependencies.

Stacktrace:  [1] _require(pkg::Base.PkgId)
   @ Base ./loading.jl:990
 [2] require(uuidkey::Base.PkgId)
   @ Base ./loading.jl:914
 [3] require(into::Module, mod::Symbol)
   @ Base ./loading.jl:901
 [4] include
   @ ./Base.jl:386 [inlined]
 [5] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, 
dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, source::Nothing)
   @ Base ./loading.jl:1213
 [6] top-level scope
   @ none:1
 [7] eval
   @ ./boot.jl:360 [inlined]
 [8] eval(x::Expr)
   @ Base.MainInclude ./client.jl:446
 [9] top-level scope
   @ none:1
in expression starting at /Users/anthony/Dropbox/Julia7/EvoTrees/src/EvoTrees.jl:1

which don't understand, as CUDA is in the Project.toml file of the EvoTrees.jl package. However, it seems CUDA (and the EvoTrees.jl models) are nevertheless installed anyway.

Once the complete environment is built (which happens by the end of the first call) there does not appear to be any problems. (So your issue has a workaround - just start with a pre-populated environment.) But there is something funny going on when building the environment as you go.

@giordano Do you have any ideas?

ablaom avatar Apr 07 '21 02:04 ablaom

Just tried again. Got the same error message the 1st time, not 2nd time. Same as before...

azev77 avatar Apr 12 '21 00:04 azev77

Appreciate that. Unfortunately, I don't have a lot of bandwidth to spare for this issue just now. We may need some help but this will be easier if we can isolate this a little better. For example, can you reproduce the error with, say, dropping the loop and just calling using model = models()[1]?

ablaom avatar Apr 12 '21 02:04 ablaom

I tried to reproduce the issue in an environment with only MLJ.jl installed with this code:

using MLJ
@inline function load_m(model_list)
    model_types = []
    @inbounds for model in model_list #models()
        if (
            1 == 1
            #model.package_name != "MLJNaiveBayesInterface" 
            #&& 
            #model.package_name !=  "NaiveBayes"
            )
            println(model.name, " ", model.package_name)
            try
                T = @eval @load $(model.name) pkg=$(model.package_name) add = true #add=true # verbosity=0
                push!(model_types, T)    
            catch e
                return e
            end 
        end     
    end
    model_instances = map(T -> T(), model_types)
end
models() |> load_m # Load all models ONCE!

But I don't get your error on EvoTrees.jl. I believe the overall issue is that you're trying to smash together some hundreds of packages in the same environment and you're set to get into version incompatibilities. I'm not sure MLJ.jl is related to this though.

giordano avatar Apr 12 '21 16:04 giordano

The following compromise seems to work for my blog post. I can add the latest versions of all packages except: Conda/PyCall/DataFrames/MLJBase. I add the immediate previous version for those four packages.

import Pkg;
Pkg.activate(mktempdir())
Pkg.add([
    Pkg.PackageSpec(name="Conda", version="1.5.1"), #1.5.2
    Pkg.PackageSpec(name="PyCall", version="1.92.2"), #1.92.3
    Pkg.PackageSpec(name="MLJ", version="0.16.2"),
    Pkg.PackageSpec(name="MLJScikitLearnInterface", version="0.1.9"), #0.1.9
    Pkg.PackageSpec(name="RDatasets", version="0.7.5"),
    Pkg.PackageSpec(name="DataFrames", version="0.22.7"), #1.0.1  #0.22.7
    Pkg.PackageSpec(name="MLJDecisionTreeInterface", version="0.1.2"),
    Pkg.PackageSpec(name="MLJMultivariateStatsInterface", version="0.2.2"),
    Pkg.PackageSpec(name="MLJModels", version="0.14.4"),
    Pkg.PackageSpec(name="BetaML", version="0.5.0"),
    Pkg.PackageSpec(name="MLJBase", version="0.18.1"), #0.18.2
    Pkg.PackageSpec(name="MLJLinearModels", version="0.5.4"), 
    Pkg.PackageSpec(name="MLJLIBSVMInterface", version="0.1.3"), 
    Pkg.PackageSpec(name="EvoTrees", version="0.7.5"),
    Pkg.PackageSpec(name="MLJNaiveBayesInterface", version="0.1.3"),
    Pkg.PackageSpec(name="MLJXGBoostInterface", version="0.1.3"),
    Pkg.PackageSpec(name="NearestNeighborModels", version="0.1.5"), 
    Pkg.PackageSpec(name="ParallelKMeans", version="0.2.2"), 
    Pkg.PackageSpec(name="PartialLeastSquaresRegressor", version="2.2.0"), 
    Pkg.PackageSpec(name="MLJFlux", version="0.1.10"), 
    Pkg.PackageSpec(name="MLJClusteringInterface", version="0.1.4"),
    Pkg.PackageSpec(name="LightGBM", version="0.5.2"),
    Pkg.PackageSpec(name="MLJGLMInterface", version="0.1.4"),
    #
    Pkg.PackageSpec(name="Plots", version="1.12.0"),
])

I still get a strange error when I load all models the first time (it works the second time)

# load_m() Install/Load all wrapped models 
@inline function load_m(model_list)
    model_types = []
    @inbounds for model in model_list #models()
            println(model.name, " ", model.package_name)
            try
                T = @eval @load $(model.name) pkg=$(model.package_name) add = true #add=true # verbosity=0
                push!(model_types, T)    
            catch e
                return e
            end 
        end     
    end
    model_instances = map(T -> T(), model_types)
end
models() |> load_m # Load all models


...
XGBoostRegressor XGBoost
[ Info: For silent loading, specify `verbosity=0`.
import MLJXGBoostInterface ✔
ERROR: MethodError: no method matching MLJScikitLearnInterface.ARDRegressor()
The applicable method may be too new: running in world age 29729, while current world is 30240.
Closest candidates are:
  MLJScikitLearnInterface.ARDRegressor(; n_iter, tol, alpha_1, alpha_2, lambda_1, lambda_2, compute_score, threshold_lambda, fit_intercept, normalize, copy_X, verbose) at none:0 (method too new to be called from this world context.)
Stacktrace:
 [1] (::var"#5#6")(T::Type{MLJScikitLearnInterface.ARDRegressor})
   @ Main .\REPL[9]:24
 [2] iterate
   @ .\generator.jl:47 [inlined]
 [3] _collect(c::Vector{Any}, itr::Base.Generator{Vector{Any}, var"#5#6"}, #unused#::Base.EltypeUnknown, isz::Base.HasShape{1})
   @ Base .\array.jl:691
 [4] collect_similar
   @ .\array.jl:606 [inlined]
 [5] map
   @ .\abstractarray.jl:2294 [inlined]
 [6] load_m
   @ .\REPL[9]:24 [inlined]
 [7] |>(x::Vector{NamedTuple{(:name, :package_name, :is_supervised, :docstring, :hyperparameter_ranges, :hyperparameter_types, :hyperparameters, :implemented_methods, :is_pure_julia, :is_wrapper, :iteration_parameter, :load_path, :package_license, :package_url, :package_uuid, :prediction_type, :supports_class_weights, :supports_online, :supports_training_losses, :supports_weights, :input_scitype, :target_scitype, :output_scitype), T} where T<:Tuple}, f::typeof(load_m))
   @ Base .\operators.jl:858
 [8] top-level scope
   @ REPL[10]:1

azev77 avatar Apr 26 '21 00:04 azev77

MLJ has MLJBase and MLJModels as dependencies, so there shouldn't be a need to explicitly add MLJBase and MLJModels to the environment, right? Any methods of MLJBase and MLJModels of use to the general users are re-exported by MLJ (if you find one that is not, post an issue).

ablaom avatar Apr 26 '21 01:04 ablaom

Curious why you need to explicitly import PyCall and Conda (although understand they are likely going to be in the Manifest).

ablaom avatar Apr 26 '21 01:04 ablaom

I need to import the older versions of Conda & PyCall for sklearn to work

azev77 avatar Apr 26 '21 02:04 azev77

The difference between your versions and the latest are a single patch release. And the patches don't appear to make changes to Project.toml files. So this is a little surprising. Mmm.

ablaom avatar Apr 26 '21 22:04 ablaom