
`predictions()`: Allow `ci_method` and `ci_type` arguments

Open arthur-albuquerque opened this issue 3 years ago • 8 comments

Edited obsolete

arthur-albuquerque avatar Jul 31 '22 03:07 arthur-albuquerque

You can do:

predictions(mod, vcov = "satterthwaite")

vincentarelbundock avatar Jul 31 '22 04:07 vincentarelbundock

Oh ok, nice! Thanks.

Another Q: insight::get_predicted() supports an argument to calculate the prediction interval (predict = "prediction").

I couldn't find a similar option in marginaleffects::predictions' documentation.

Is that currently available?

arthur-albuquerque avatar Jul 31 '22 15:07 arthur-albuquerque

In principle, I could add an option to do type="prediction". However, I must admit that I am not super familiar with what insight does to get prediction intervals. In particular,

  • What model types are supported?
  • Will all unsupported types return an error?
  • What specific mathematical operation is done to compute prediction intervals for each supported model?
  • Is there a reference to published literature showing the statistical properties of the resulting estimates?

I could not immediately find the answers in the documentation, and I hesitate to implement this without more information about the procedure.

I'll be happy to do this if someone can help me fill the gap in my knowledge here.

vincentarelbundock avatar Jul 31 '22 18:07 vincentarelbundock

I understand your concerns. I am no insight expert; perhaps @bwiernik could chip in :)

arthur-albuquerque avatar Jul 31 '22 18:07 arthur-albuquerque

I'm not sure he was too involved with the get_predicted() function...

I could look into this myself, but I don't really have time right now unfortunately.

vincentarelbundock avatar Jul 31 '22 18:07 vincentarelbundock

> In principle, I could add an option to do type="prediction". However, I must admit that I am not super familiar with what insight does to get prediction intervals. In particular,

It's in get_predicted_se(). I think it's only called for Gaussian models.

  # add sigma to standard errors, i.e. confidence or prediction intervals
  ci_type <- match.arg(ci_type, c("confidence", "prediction"))
  if (ci_type == "prediction") {
    if (is_mixed_model(x)) {
      se <- sqrt(var_diag + get_variance_residual(x))
    } else {
      se <- sqrt(var_diag + get_sigma(x)^2)
    }
  } else {
    se <- sqrt(var_diag)
  }
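
For an ordinary linear model, the same arithmetic can be checked against base R. This is a minimal sketch, not insight's code: it assumes `var_diag` is the variance of the fitted values (here taken as `se.fit^2` from `predict()`), and it uses the built-in `mtcars` data purely for illustration.

```r
# Base R only; `mtcars` ships with R.
mod <- lm(mpg ~ hp + wt, data = mtcars)
new <- data.frame(hp = c(100, 150), wt = c(2.5, 3.2))

p <- predict(mod, newdata = new, se.fit = TRUE)
var_diag <- p$se.fit^2                     # assumed meaning of `var_diag`

se_conf <- sqrt(var_diag)                  # ci_type = "confidence"
se_pred <- sqrt(var_diag + sigma(mod)^2)   # ci_type = "prediction"

# Build a 95% prediction interval by hand from the inflated standard error.
crit <- qt(0.975, df = p$df)
manual <- cbind(fit = p$fit,
                lwr = p$fit - crit * se_pred,
                upr = p$fit + crit * se_pred)

# Compare with the interval that predict.lm() computes itself.
builtin <- predict(mod, newdata = new, interval = "prediction")
all.equal(unname(manual), unname(builtin))
```

The only difference between the two interval types is the residual variance added under the square root, which is why the prediction interval is always the wider one.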

strengejacke avatar Aug 15 '22 14:08 strengejacke

And for Bayesian models, it calls posterior_predict() instead of posterior_epred().

bwiernik avatar Aug 15 '22 21:08 bwiernik

Thanks both!

vincentarelbundock avatar Aug 16 '22 18:08 vincentarelbundock

If you are interested in supporting prediction intervals, you might like to look into https://rdrr.io/cran/ciTools/, which for the moment supports (log-)linear models, (log-)linear mixed models, generalized linear models, generalized linear mixed models, and accelerated failure time models...

tomwenseleers avatar Aug 26 '22 13:08 tomwenseleers

Thanks for the link.

At the moment, marginaleffects delegates most of the computation done by the predictions() function to the insight::get_predicted() function (marginaleffects handles the computation for all other functions by itself). Currently, I have no plans to take over for insight, so prediction intervals for other types of models would have to be implemented upstream, in insight.

The question in this thread is whether we want to allow predictions() to access insight's existing prediction interval capabilities.

vincentarelbundock avatar Aug 26 '22 14:08 vincentarelbundock

@DominiqueMakowski @bwiernik we might look at how ciTools calculates prediction intervals for those models that we haven't covered yet. For glm.nb, for instance, they simulate from the response and cut off the 95% interval from those values (https://github.com/jthaman/ciTools/blob/master/R/add_pi_negbin.R) - doesn't look too complicated.

strengejacke avatar Aug 26 '22 16:08 strengejacke

See also arm::sim() for a similar approach

bwiernik avatar Aug 26 '22 17:08 bwiernik

(and parameters::simulate_parameters())

strengejacke avatar Aug 26 '22 17:08 strengejacke

> For glm.nb, for instance, they simulate from the response and cut off the 95% interval from those values... doesn't look too complicated.

That just sounds like Krinsky-Robb. Are we sure those are really prediction intervals and not just confidence intervals around fitted values? If the latter, I don't personally see much value...

vincentarelbundock avatar Aug 26 '22 19:08 vincentarelbundock

There's an additional step using rnegbin(), so it's actually a bit different than just sims and quantiles.
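
To make the distinction concrete, here is a generic sketch of the idea in base R. It is not ciTools' code, and it swaps in a Poisson model so that no extra packages are needed; the data are simulated and all variable names are hypothetical.

```r
set.seed(1)

# Simulated count data (hypothetical; not from the thread).
n <- 200
x <- runif(n, 0, 2)
y <- rpois(n, lambda = exp(0.5 + 0.8 * x))
mod <- glm(y ~ x, family = poisson)

# Step 1: draw coefficient vectors from their approximate sampling
# distribution, N(beta_hat, V), using a Cholesky factor.
n_sim <- 5000
beta_hat <- coef(mod)
L <- t(chol(vcov(mod)))
z <- matrix(rnorm(length(beta_hat) * n_sim), nrow = length(beta_hat))
beta_sim <- beta_hat + L %*% z            # one column per draw

# Step 2: turn each draw into an expected count at x = 1.
X_new <- c(1, 1)                          # intercept, x = 1
mu_sim <- exp(as.vector(X_new %*% beta_sim))

# Quantiles of the simulated means: an interval around the fitted value.
ci_mean <- quantile(mu_sim, c(0.025, 0.975))

# Step 3 (the extra step): draw a new response for each simulated mean,
# then take quantiles of the simulated responses.
y_sim <- rpois(n_sim, lambda = mu_sim)
pi_resp <- quantile(y_sim, c(0.025, 0.975))

ci_mean
pi_resp
```

Stopping after step 2 gives the simulation-based confidence interval; step 3 adds the response-level noise. For a negative binomial fit, the analogous draw would presumably be something like `stats::rnbinom(n_sim, size = theta, mu = mu_sim)`, with `theta` the estimated dispersion, but that is an assumption rather than a description of the linked file.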

strengejacke avatar Aug 26 '22 19:08 strengejacke

Yep, it's effectively putting a beta prior on the probability parameter

bwiernik avatar Aug 26 '22 23:08 bwiernik

Effectively it's putting a beta prior on the p parameter

bwiernik avatar Aug 27 '22 12:08 bwiernik

Have you just repeated your last answer?

strengejacke avatar Aug 27 '22 13:08 strengejacke

> Have you just repeated your last answer?

It was a very good answer; probably worth repeating ;)

vincentarelbundock avatar Aug 27 '22 13:08 vincentarelbundock

Stupid GitHub iOS app not showing things right 🤬

bwiernik avatar Aug 27 '22 17:08 bwiernik

Note to self (and interested lurkers):

I am tempted to recommend that people use conformal inference to compute prediction intervals. It seems super flexible, and it is much easier to implement than all the custom code that would be required to support classical prediction intervals for the wide array of models supported by marginaleffects.

Here's a short blog post on conformal inference using marginaleffects: https://arelbundock.com/posts/conformal/
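
For readers who have not seen the technique, this is a minimal split-conformal sketch in base R. It is not the code from the linked post and does not use marginaleffects; the data are simulated, the absolute residual is just one possible conformity score, and all names are hypothetical.

```r
set.seed(42)

# Simulated data (hypothetical).
n <- 3000
x <- runif(n, -2, 2)
y <- 1 + 2 * x + rnorm(n, sd = 1.5)
dat <- data.frame(x = x, y = y)

# Split into training, calibration, and test sets of equal size.
idx <- sample(rep(c("train", "calib", "test"), each = n / 3))
train <- dat[idx == "train", ]
calib <- dat[idx == "calib", ]
test  <- dat[idx == "test", ]

# Any model with a predict() method could be used here.
mod <- lm(y ~ x, data = train)

# Conformity score: absolute residual on the calibration set.
scores <- abs(calib$y - predict(mod, newdata = calib))

# Finite-sample-corrected quantile of the scores for 90% coverage.
alpha <- 0.1
k <- ceiling((length(scores) + 1) * (1 - alpha))
q_hat <- sort(scores)[k]

# Prediction interval: point prediction plus or minus q_hat.
pred <- predict(mod, newdata = test)
lower <- pred - q_hat
upper <- pred + q_hat

# Empirical coverage on held-out data; should be near 1 - alpha.
coverage <- mean(test$y >= lower & test$y <= upper)
coverage
```

Nothing in the interval construction depends on the model being linear, which is the flexibility being pointed to: only the fitting step and the score function change from one model type to another.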

vincentarelbundock avatar Sep 22 '22 13:09 vincentarelbundock

Is this something we could implement to enhance insight::get_predicted_ci()?

strengejacke avatar Sep 22 '22 15:09 strengejacke

> Is this something we could implement to enhance insight::get_predicted_ci()?

Maybe, but I'm not convinced this is a good idea. Each model type (and predict/type value) may require a different conformity score function, and the best such function is likely use-case-specific.
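
As a rough illustration of that point, the sketch below keeps the conformal quantile step generic and makes the score function the part that varies. Every helper here is hypothetical and written for illustration only; none of them is part of insight or marginaleffects.

```r
# Regression: absolute residual.
score_abs <- function(y, yhat) abs(y - yhat)

# Regression with unequal variance: residual scaled by an estimated spread `s`.
score_scaled <- function(y, yhat, s) abs(y - yhat) / s

# Classification: one minus the predicted probability of the true class.
# `prob` is an observations-by-classes matrix, `y` an integer class index.
score_class <- function(y, prob) 1 - prob[cbind(seq_along(y), y)]

# Shared step: finite-sample-corrected quantile of calibration scores.
conformal_quantile <- function(scores, alpha = 0.1) {
  n <- length(scores)
  k <- ceiling((n + 1) * (1 - alpha))
  if (k > n) return(Inf)   # too few calibration points for this alpha
  sort(scores)[k]
}

# Tiny usage example with made-up numbers.
prob <- matrix(c(0.7, 0.3,
                 0.2, 0.8), nrow = 2, byrow = TRUE)
score_class(c(1, 2), prob)
conformal_quantile(score_abs(1:10, rep(0, 10)), alpha = 0.1)
```

The quantile step is a few lines, while choosing among scores like these depends on the model, the `type` of prediction, and the use case, so a single default would be hard to justify.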

vincentarelbundock avatar Sep 22 '22 15:09 vincentarelbundock