predict_types should be an active binding

Open pfistfl opened this issue 2 years ago • 2 comments

Since I am stumbling over this for the nth time:

predict_type and predict_types are easy to confuse (and it happens to me quite a lot).

  • predict_type: The concrete type of prediction the learner should yield.
  • predict_types: The theoretical capabilities of the learner, i.e. which types of prediction it can yield.

I think for less involved users, this might be an even bigger problem.

Solution:

predict_types should IMHO be immutable (it is a property of the learner). -> Encode it as an active binding (AB), and if the user tries to set it, point them to predict_type in the error message.
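A minimal sketch of what this could look like with an R6 active binding. ToyLearner and its fields are hypothetical and only illustrate the idea; this is not mlr3's actual Learner class, and the error message wording is an assumption.

```r
library(R6)

# Hypothetical toy class for illustration; not mlr3's Learner.
ToyLearner <- R6Class("ToyLearner",
  public = list(
    # Mutable: the concrete prediction type the user selects.
    predict_type = NULL,
    initialize = function(predict_types, predict_type = predict_types[1]) {
      private$.predict_types <- predict_types
      self$predict_type <- predict_type
    }
  ),
  active = list(
    # Read-only: reading returns the capabilities, assigning raises an error
    # that points the user to `predict_type`.
    predict_types = function(rhs) {
      if (!missing(rhs)) {
        stop("`predict_types` is read-only; did you mean to set `predict_type`?")
      }
      private$.predict_types
    }
  ),
  private = list(
    .predict_types = NULL
  )
)

learner <- ToyLearner$new(predict_types = c("response", "prob"))
learner$predict_type <- "prob"  # allowed: selects the concrete type

# Assigning to the active binding fails; capture the message instead of aborting.
res <- tryCatch(
  {
    learner$predict_types <- "prob"
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(res)
```

With this, a typo of predict_types for predict_type fails loudly at assignment time instead of silently overwriting the capabilities.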

More generally: Do we need predict_type?

  • Is there a single learner that uses it during training?
  • predict_type is mutable after training, so we could break learners if they were to use it during training.
  • If we e.g. want to mutate predict_types after training for BenchmarkResults, this is no longer possible due to the use of read-only ABs.
  • For most cases, it could just be an extra arg added to $predict instead.

Sidenote Default

It's super annoying (and IMHO unnecessary) to have to set predict_type = "prob". I always remember it only when I try to use probabilistic measures AFTER having trained the model. In 99.9% of cases it does not matter what predict type was set for the inducer, and I could change it post hoc (which afaik works as long as the learner is not e.g. in a resample result, where things are way more difficult). Question: Should we not predict prob by default IF the learner can do it? Do we have any learners that cannot predict prob?
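To make the proposed default concrete, here is a rough sketch. default_predict_type is a hypothetical helper, not an mlr3 function, and falling back to the first supported type is an assumption on my side.

```r
# Hypothetical helper: choose "prob" when the learner supports it,
# otherwise fall back to the first supported type (assumed fallback rule).
default_predict_type <- function(predict_types) {
  stopifnot(is.character(predict_types), length(predict_types) >= 1L)
  if ("prob" %in% predict_types) "prob" else predict_types[[1L]]
}

with_prob <- default_predict_type(c("response", "prob"))
without_prob <- default_predict_type(c("response", "se"))
print(with_prob)
print(without_prob)
```

Learners that cannot predict probabilities would keep their current behaviour under this rule, so only the default for prob-capable learners would change.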

pfistfl, Aug 24 '22 07:08

Is there a single learner that uses it during training?

Grepping a bit:

I'd conclude that for the few cases where predict_type is used in training, it could be a hyperparameter. For xgboost and earth it already is, in a way, since predict_type is only used there to check whether the hyperparameters are set up correctly. I think we could therefore just use $predict_type during prediction (or as an argument of $predict(), or both, with the latter overriding the former), and throw errors during prediction in the few cases where different HPs were needed during training.
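Roughly, the resolution logic I have in mind could be sketched like this. resolve_predict_type and its arguments are hypothetical and not part of mlr3; trained_types stands for whatever types the fitted model can actually produce given the HPs used during training.

```r
# Hypothetical helper, not part of mlr3.
# field:         the learner's $predict_type
# supported:     the learner's $predict_types
# arg:           optional predict_type passed to $predict(); overrides `field`
# trained_types: types the fitted model can produce with its training HPs
resolve_predict_type <- function(field, supported, arg = NULL,
                                 trained_types = supported) {
  type <- if (is.null(arg)) field else arg  # the argument wins over the field
  if (!(type %in% supported)) {
    stop(sprintf("Learner cannot predict type '%s'", type))
  }
  if (!(type %in% trained_types)) {
    stop(sprintf(
      "Predict type '%s' needs different hyperparameters during training", type
    ))
  }
  type
}

from_field <- resolve_predict_type("response", c("response", "prob"))
from_arg <- resolve_predict_type("response", c("response", "prob"), arg = "prob")

# A model trained with HPs that only allow "response" errors at predict time.
err <- tryCatch(
  resolve_predict_type("response", c("response", "prob"),
                       arg = "prob", trained_types = "response"),
  error = function(e) conditionMessage(e)
)
print(err)
```

The error is only raised at prediction time, so training never has to look at predict_type.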

mb706, Sep 26 '22 10:09

I guess to move forward here we would need the opinion of @berndbischl and @mllg?

pfistfl, Oct 25 '22 09:10