
Out-of-fold predictions for unsupervised models?

Open salbalkus opened this issue 1 year ago • 2 comments

I have written a custom unsupervised learning model that implements transform and fit. I'd like to use MLJ to perform resampling and obtain out-of-fold output for this model. For example, given a 5-fold cross-validation, I would like to fit the model on four of the folds and obtain the output of transform on the left-out fold, then repeat this until I've obtained "out-of-fold predictions" for every fold.

Is there (or could there be) an easy way to do this using MLJ without having to write my own function to extract folds? As mentioned in #575, evaluate! with a custom measure that simply returns the prediction can accomplish this for supervised models. However, this does not appear to work for unsupervised models, since evaluate! seems to require a predict method and a y value.
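
For concreteness, here is a rough sketch of the manual fold-extraction I'm hoping to avoid writing and maintaining myself (`model` and `X` are placeholders for my transformer and data table, and I'm assuming `train_test_pairs` is the right way to materialize the folds):

```julia
using MLJ

# Manual out-of-fold transforms -- the workaround I'd like MLJ to absorb.
# `model` is a placeholder for the custom unsupervised model, `X` for the data table.
cv = CV(nfolds=5, shuffle=true, rng=123)
pairs = MLJ.MLJBase.train_test_pairs(cv, 1:nrows(X))  # [(train_rows, test_rows), ...]

oof = map(pairs) do (train, test)
    mach = machine(model, selectrows(X, train))
    fit!(mach, verbosity=0)                                   # fit on the four in-fold folds
    (rows = test, Z = transform(mach, selectrows(X, test)))   # transform the held-out fold
end

# oof[k].Z is the transform of fold k's held-out rows; stitching these back into
# the original row order is left to the caller, since it depends on the output type.
```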

salbalkus avatar Jun 10 '23 01:06 salbalkus

@salbalkus Yes, currently evaluate! is limited to models that can predict, which does include outlier detection models, but not general unsupervised models.

I think your suggestion to add some kind of support makes sense. As you say, this is closely related to https://github.com/alan-turing-institute/MLJ.jl/issues/575 , which could be dealt with at the same time.

Also related is: https://github.com/JuliaAI/MLJBase.jl/issues/660

ablaom avatar Jun 15 '23 22:06 ablaom

In the meantime, you may be able to implement what you want using a learning network. The MLJ Stack composite model is implemented using learning networks, and there out-of-fold predictions are spliced together in just the way you describe.

See this tutorial for the general idea. Note that the way learning networks are exported as stand-alone models has changed (the @from_network macro appearing in the tutorial is deprecated); see here for the latest docs.
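
Untested, but the rough shape of such a network, mimicking the way the Stack assembles its out-of-fold training set, would be something like this (treat the row selection and the final concatenation as assumptions to adapt to your transformer's output):

```julia
using MLJ

# Untested sketch: a learning network that splices together out-of-fold
# transforms, in the spirit of the Stack's internal construction.
# `model` and `X` are placeholders for your transformer and data table.
Xs = source(X)
tt_pairs = MLJ.MLJBase.train_test_pairs(CV(nfolds=5), 1:nrows(X))

fold_nodes = map(tt_pairs) do (train, test)
    mach = machine(model, selectrows(Xs, train))   # this machine sees only the in-fold rows
    transform(mach, selectrows(Xs, test))          # node delivering the held-out transform
end

# Combine the per-fold outputs into a single node. Plain vcat assumes the
# transformer's output supports it, and rows come back in fold order, not the original order.
Z = node(vcat, fold_nodes...)

fit!(Z, verbosity=0)   # trains all five machines
Z()                    # the spliced out-of-fold output
```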

ablaom avatar Jun 15 '23 23:06 ablaom