
Extract y and yhat for each test fold from results of evaluate!

Open CameronBieganek opened this issue 4 years ago • 3 comments

Sometimes one wants to look at the actual and predicted y values for each test fold in a cross-validation. For example, one might want to make a plot of the residuals versus the predicted values. As far as I can tell, there's not an easy way to do that right now.

This is mentioned in #89, but I thought it would be good to have a more specific issue.

The scikit-learn equivalent of this feature request is cross_val_predict().
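
For reference, the kind of thing one has to write by hand today looks roughly like this (just a sketch; manual_cv_predict and the fold construction are made up for illustration):

using MLJ, Random

# Rough manual workaround: build the CV folds by hand, fit a machine on the
# complement of each fold, and collect the per-fold actual and predicted targets.
function manual_cv_predict(model, X, y; nfolds=3, rng=Random.default_rng())
    n = nrows(X)
    perm = randperm(rng, n)
    folds = [perm[i:nfolds:n] for i in 1:nfolds]
    ŷs, ys = [], []
    for test in folds
        train = setdiff(1:n, test)
        mach = machine(model, X, y)
        fit!(mach, rows=train, verbosity=0)
        push!(ŷs, predict(mach, selectrows(X, test)))
        push!(ys, y[test])
    end
    return ŷs, ys
end

Concatenating the fold-wise vectors (e.g. with reduce(vcat, ŷs)) then gives the inputs for a residuals-versus-predicted plot.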

CameronBieganek avatar Jun 18 '20 22:06 CameronBieganek

Well, you can give evaluate/evaluate! a custom measure (see https://alan-turing-institute.github.io/MLJ.jl/dev/performance_measures/#Traits-and-custom-measures-1) and "measure" can be just about any function of the data (this issue notwithstanding: https://github.com/alan-turing-institute/MLJBase.jl/issues/352). So, how about this?

using MLJ

X = (x = rand(12),)  # any 12-row table will do; ConstantRegressor ignores the features
y = float.(1:12)

model = ConstantRegressor()

# "measures" that simply pass through the predictions and the ground truth:
predicted_target(yhat, y) = yhat
target(yhat, y) = y
MLJ.reports_each_observation(::typeof(predicted_target)) = true
MLJ.reports_each_observation(::typeof(target)) = true

e = evaluate(model, X, y,
             measures=[predicted_target, target],
             resampling=CV(nfolds=3),
             operation=predict_mean)

julia> e.per_observation
2-element Array{Array{Array{Float64,1},1},1}:
 [[8.5, 8.5, 8.5, 8.5], [6.5, 6.5, 6.5, 6.5], [4.5, 4.5, 4.5, 4.5]]   
 [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], [9.0, 10.0, 11.0, 12.0]]
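
With that, the per-fold residuals for the original use case fall out directly; for example:

ŷ_folds, y_folds = e.per_observation   # ordered as the measures were passed
residuals_per_fold = [y .- ŷ for (ŷ, y) in zip(ŷ_folds, y_folds)]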

ablaom avatar Jun 25 '20 19:06 ablaom

Thanks! That's a pretty good solution. However, I think it would be worth adding a method to the MLJ API that does this out of the box. Something like the following could be pretty useful.

struct CVPredictions{T}
    ŷ::Vector{<:AbstractVector{T}}
    y::Vector{<:AbstractVector{T}}
end

cv_predict(model, X, y)   # returns a CVPredictions object
cv_predict!(mach)         # returns a CVPredictions object

function evaluate(cvp; measure)
    # evaluate measures on each fold
    # return model evaluation
end

function evaluate(model, X, y; measure)
    cvp = cv_predict(model, X, y)
    evaluate(cvp; measure)
end

function evaluate!(mach; measure)
    cvp = cv_predict!(mach)
    evaluate(cvp; measure)
end

export cv_predict, cv_predict!, evaluate, evaluate!
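
Usage could then look something like this (hypothetical, using the names sketched above and standard MLJ measures):

cvp = cv_predict(model, X, y)        # expensive resampling, run once
evaluate(cvp; measure=rms)           # cheap: reuses the stored predictions
evaluate(cvp; measure=mae)           # ...as does any measure added later
residuals = reduce(vcat, cvp.y) .- reduce(vcat, cvp.ŷ)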

Motivating example

Here's one example where it would be nice to have the separate cv_predict function. Suppose I have a classification task. Suppose that running evaluate is expensive because I have a lot of data and/or I'm doing a grid search. I would like to be able to do my cross-validation just once and save the predictions to disk. That way if I want to evaluate other metrics that I didn't evaluate the first time, I can simply run evaluate(cvp; measure=newmeasures), which should be fast.
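
A sketch of that workflow, assuming the hypothetical cv_predict above and the standard library Serialization for the on-disk step:

using Serialization

cvp = cv_predict(model, X, y)            # run the cross-validation once
serialize("cv_predictions.jls", cvp)     # save the fold-wise predictions to disk

# ...later, possibly in a fresh session:
cvp = deserialize("cv_predictions.jls")
evaluate(cvp; measure=accuracy)          # no re-fitting required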

To extend the example, suppose I define a measure like this:

cost = let
    # per-cell misclassification costs (errors penalized asymmetrically)
    cost_matrix = [0   10;
                   100  0]

    function cost(ŷ, y)
        confusion = confusion_matrix(ŷ, y)
        # weight the raw counts (the .mat field of the ConfusionMatrix) by the costs
        sum(confusion.mat .* cost_matrix)
    end
end

Then if I later decide that I want to change the cost matrix, it would be nice if I could just run evaluate(cvp; measure=cost) instead of having to re-run the cross-validation.
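
For example, the cost matrix could be closed over, so only the cheap evaluate call is repeated (again hypothetical, reusing cvp and the .mat access from above):

make_cost(cost_matrix) = (ŷ, y) -> sum(confusion_matrix(ŷ, y).mat .* cost_matrix)

evaluate(cvp; measure=make_cost([0 10; 100 0]))
evaluate(cvp; measure=make_cost([0 1; 50 0]))   # new costs, same saved predictions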

CameronBieganek avatar Jul 24 '20 20:07 CameronBieganek

So basically you just want to insert a new interface point. Sounds like a good idea. A few comments:

  • some measures require weights w to be evaluated, and some (usually custom) measures also require X. If X is large, this could be a problem. Do you have a suggestion for handling this case?

  • we'd probably want to have acceleration options for cv_predict and evaluate

  • I'm not too keen on the name, as resampling includes strategies like Holdout which are not CV. Maybe resample_predict and ResampledPredictions?

ablaom avatar Aug 05 '20 23:08 ablaom