Interval predictions

Open hfrick opened this issue 2 years ago • 0 comments

Currently, confidence and prediction intervals are their own prediction types. This works well for the regression and classification modes, where the intervals are built around either the numeric prediction or the predicted class probability. For the censored regression mode, it's ambiguous which prediction type the intervals refer to: it could be the survival probability or the quantiles (or possibly also the event time?).

So instead of having type = "conf_int" and type = "pred_int", we are going to fold them into the numeric and prob prediction types, and their respective functions gain an argument interval = c("none", "confidence", "prediction"). (Getting both confidence and prediction intervals requires calling predict() twice.)

# mock-up

predict(lm_fit, new_data, interval = "prediction", level = 0.8)
#> # A tibble: 3 × 3
#>   .pred .pred_lower .pred_upper
#>   <dbl>       <dbl>       <dbl>
#> 1  22.6        20.7        24.5
#> 2  22.1        20.2        24.0
#> 3  26.3        24.5        28.0

predict(randomforest_fit, new_data, type = "prob", interval = "confidence")
#> # A tibble: 2 × 6
#>   .pred_Class1 .pred_Class2 .pred_lower_Class1 .pred_upper_Class1 .pred_lower_Class2 .pred_upper_Class2
#>          <dbl>        <dbl>              <dbl>              <dbl>              <dbl>              <dbl>
#> 1        0.520       0.480              0.0254                  1                  0              0.975
#> 2        0.961       0.0386             0.885                   1                  0              0.115
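
For comparison, here is a rough sketch of how the proposed output shape can be approximated with the existing type = "conf_int" and type = "pred_int" prediction types. The helper name predict_with_interval() is hypothetical and not part of parsnip; it only illustrates mapping the proposed interval argument onto the current types, assuming a regression fit with the lm engine.

```r
library(parsnip)

# Hypothetical helper (not a parsnip function): emulates the proposed
# `interval` argument on top of the existing interval prediction types.
predict_with_interval <- function(fit, new_data,
                                  interval = c("none", "confidence", "prediction"),
                                  level = 0.95) {
  interval <- match.arg(interval)

  # Point predictions: a tibble with a single `.pred` column.
  point <- predict(fit, new_data)
  if (interval == "none") {
    return(point)
  }

  # Map the proposed argument value onto the current prediction types.
  type <- switch(interval,
    confidence = "conf_int",
    prediction = "pred_int"
  )

  # Interval bounds: a tibble with `.pred_lower` and `.pred_upper`.
  bounds <- predict(fit, new_data, type = type, level = level)

  dplyr::bind_cols(point, bounds)
}

# Example data and model, chosen for illustration only.
lm_fit <- fit(set_engine(linear_reg(), "lm"), mpg ~ wt + hp, data = mtcars)
new_data <- mtcars[1:3, ]

res <- predict_with_interval(lm_fit, new_data, interval = "prediction", level = 0.8)
res
```

The helper calls predict() twice internally, which is the cost the proposal would remove for users who want the point prediction and its interval together.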

For the censored regression mode, see https://github.com/tidymodels/censored/issues/134.

hfrick avatar Dec 02 '21 18:12 hfrick