mlr3proba

Update, extend, clean up distrcompositor

Open RaphaelS1 opened this issue 4 years ago • 3 comments

Following changes are required:

  1. Remove composition from crank to distr - This doesn't make any sense for abstract rankings; composition can only make sense for lp to distr.
  2. Add composition from response to distr - This can be done most efficiently by abstracting the probabilistic regression composition and using the same functions in both.

(1) is higher priority, as the results of the current crank to distr composition are meaningless.
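To make point 2 concrete, here is a minimal sketch of a response to distr composition, written in Python rather than R for illustration. It assumes the predicted survival time is used as the mean of a Normal distribution with a user-supplied standard deviation, in the spirit of a probabilistic regression composition. The helper names `normal_cdf` and `compose_response_to_distr` and the choice of a Normal distribution are assumptions for this sketch, not the mlr3proba API.

```python
import math


def normal_cdf(x, mean, sd):
    """CDF of a Normal(mean, sd) distribution evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def compose_response_to_distr(response, sd):
    """Hypothetical helper: turn a predicted survival time into a survival function.

    Assumes T ~ Normal(response, sd), so S(t) = 1 - F(t).
    """
    def survival(t):
        return 1.0 - normal_cdf(t, response, sd)

    return survival


# Predicted survival time of 12 time units, assumed spread of 3.
surv = compose_response_to_distr(response=12.0, sd=3.0)
```

A Normal distribution puts some mass on negative times, so a real implementation would probably prefer a distribution supported on the positive reals; the structure of the composition would stay the same.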

RaphaelS1, Dec 18 '20 09:12

> Remove composition from crank to distr

Why? I can imagine cases where this makes sense!

The most obvious case would be calibration methods, e.g., plug your predicted rank into a distribution as a shape and/or location parameter, then tune the remaining parameters by grid search (or by gradient descent if/once autodiff is supported).

This is very common for probabilistic classifiers, and the probabilistic regression/survival counterpart is also not absurd.
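A minimal sketch of that calibration idea, under loudly stated assumptions: the crank is treated as a log-risk score (higher means higher risk), event times are assumed Exponential with rate `exp(crank) / scale`, and the single free parameter `scale` is grid-tuned by maximising the right-censored log-likelihood. The function names and the toy data are hypothetical and only illustrate the mechanism.

```python
import math


def censored_exp_loglik(scale, cranks, times, events):
    """Right-censored log-likelihood for T_i ~ Exponential(rate_i),
    with rate_i = exp(crank_i) / scale (assumed parametrisation)."""
    loglik = 0.0
    for crank, time, event in zip(cranks, times, events):
        rate = math.exp(crank) / scale
        # Observed events contribute log f(t) = log(rate) - rate * t,
        # censored observations contribute log S(t) = -rate * t.
        loglik += event * math.log(rate) - rate * time
    return loglik


def grid_tune_scale(cranks, times, events, grid):
    """Return the grid value of `scale` with the highest log-likelihood."""
    return max(grid, key=lambda s: censored_exp_loglik(s, cranks, times, events))


# Toy data: predicted ranks, observed times, event indicators (1 = event, 0 = censored).
cranks = [0.0, 0.5, 1.0, 1.5]
times = [10.0, 7.0, 4.0, 2.0]
events = [1, 1, 0, 1]

grid = [0.5 * k for k in range(1, 61)]  # candidate scales 0.5, 1.0, ..., 30.0
best_scale = grid_tune_scale(cranks, times, events, grid)
```

For this particular model the maximiser has a closed form, `sum(exp(crank_i) * t_i) / sum(events)`, so the grid search is only a stand-in for the tuning one would need with a less convenient distribution.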

Here's probability calibration in sklearn: https://scikit-learn.org/stable/modules/calibration.html

fkiraly, Dec 18 '20 13:12

Sorry, I should clarify this:

The current composition assumes crank = lp and then uses a semi-parametric composition, e.g. h(t) = h_0(t)lp for baseline h_0.
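As an illustration of such a semi-parametric composition, the sketch below uses the standard proportional-hazards form h(t) = h_0(t) exp(lp), which is equivalent to S(t) = S_0(t)^exp(lp). Note that the comment above writes the multiplier as lp itself; the exp(lp) form is an assumption made here because it is the usual PH convention. The exponential baseline and the names `baseline_survival` and `compose_ph` are hypothetical.

```python
import math


def baseline_survival(t, base_rate=0.1):
    """Assumed baseline S_0(t): exponential with a constant hazard of base_rate."""
    return math.exp(-base_rate * t)


def compose_ph(lp, baseline=baseline_survival):
    """Hypothetical helper: PH composition S(t | lp) = S_0(t) ** exp(lp)."""
    def survival(t):
        return baseline(t) ** math.exp(lp)

    return survival


low_risk = compose_ph(lp=-1.0)   # hazard scaled by exp(-1)
high_risk = compose_ph(lp=1.0)   # hazard scaled by exp(1)
```

This also shows why the scale of the input matters: a linear predictor carries magnitude information, whereas an abstract ranking only fixes an ordering, so feeding a crank through the same formula yields a distribution whose shape depends on an arbitrary scale.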

However, the new composition type to add (point 2) would instead use the composition you describe, which can handle crank.

RaphaelS1, Dec 18 '20 14:12

ah, makes sense.

fkiraly, Dec 18 '20 15:12