Interface for ensembles

Open mjskay opened this issue 1 year ago • 4 comments

(I'm not sure where this lives, so I'm putting it here because at least I know the relevant folks will see it and then we can discuss ;).)

I think it might be helpful to have a lightweight class for ensembles of models with weights. Perhaps something that contains a list of models and weights, and implements most of the generic model functions (posterior_predict(), etc.; basically the list in #39) by resampling output from the models in the list according to those weights.

Unless I've missed something, current approaches to this seem to be duplicated across several packages (loo, brms, marginaleffects, ...); see some discussion on this Twitter thread.

The advantage of what I am proposing is that packages that build on top of the generic model functions (#39) would then automatically support ensembles. Currently such packages have to be "ensemble-aware" in a non-generic way; e.g. for ensembles of brms models they have to use brms::pp_average() instead of posterior_predict() / etc.

Not sure where such a class should live (here? rstantools?). I guess this partly depends on whether the generic functions from rstantools eventually get moved here per #39.

Thoughts? Pinging @avehtari @paul-buerkner @jgabry.

mjskay, Aug 15 '22 16:08

The ensemble posterior could be for the parameters or for the predictions (the latter was discussed on Twitter today).

The first case is easier, as some function in posterior could be given a set of draws objects and corresponding weights. These could be merged into a draws object with weights, which could then be transformed to an unweighted draws object with resample_draws(). To save memory, it would be possible to use resample_draws() to resample just the index and then copy only the selected draws from the given set of draws objects. This first case would also work for models written directly in Stan with generated quantities generating yrep.
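As a rough illustration of the "resample the index first, then copy only the selected draws" idea, here is a minimal base-R sketch. Everything in it is an assumption for illustration: `mix_draws_sketch()` is a hypothetical name, the draws are plain numeric matrices rather than posterior draws objects, and the sampling is simple multinomial sampling rather than anything resample_draws() does.

```r
# Hypothetical helper, not part of posterior.
# `draws_list`: list of numeric matrices (rows = draws, columns = variables)
#               that all share the same columns.
# `weights`:    one non-negative weight per model.
mix_draws_sketch <- function(draws_list, weights, ndraws = NULL) {
  stopifnot(length(draws_list) == length(weights),
            all(weights >= 0), sum(weights) > 0)
  weights <- weights / sum(weights)
  if (is.null(ndraws)) ndraws <- min(vapply(draws_list, nrow, integer(1)))

  # Step 1: resample only the model index, one index per output draw.
  model_idx <- sample.int(length(draws_list), size = ndraws,
                          replace = TRUE, prob = weights)

  # Step 2: for each selected model, copy just as many rows as were requested.
  out <- matrix(NA_real_, nrow = ndraws, ncol = ncol(draws_list[[1]]),
                dimnames = list(NULL, colnames(draws_list[[1]])))
  for (k in unique(model_idx)) {
    rows <- which(model_idx == k)
    picked <- sample.int(nrow(draws_list[[k]]), size = length(rows),
                         replace = TRUE)
    out[rows, ] <- draws_list[[k]][picked, , drop = FALSE]
  }
  out
}

# Usage with two made-up sets of draws.
set.seed(1)
d1 <- cbind(mu = rnorm(1000, mean = 0),  sigma = rep(1, 1000))
d2 <- cbind(mu = rnorm(1000, mean = 10), sigma = rep(2, 1000))
mixed <- mix_draws_sketch(list(d1, d2), weights = c(0.8, 0.2), ndraws = 500)
only_first <- mix_draws_sketch(list(d1, d2), weights = c(1, 0), ndraws = 50)
```

Only `ndraws` rows are ever copied, which is the memory saving described above.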

Currently posterior doesn't know anything about models or about functions that map from the parameter posterior to predictions. But if I understood correctly, @mjskay, you suggest that #39 could define a common API for prediction-related functions, which could then be extended to ensembles. If the function is given the models, then it would also be possible to first use resample_draws() to resample just the index and then ask each model to predict only the necessary number of draws.

avehtari, Aug 15 '22 16:08

Thanks for opening this @mjskay! The current ensemble implementation for models of class brmsfit in {marginaleffects} is built on top of brms::pp_average since it relies on expectations for stacked/model averaged AMEs and contrasts. If you wanted model averaged/stacked parameters, you would use brms::posterior_average instead.

If there were a common API for ensemble predictions, it would be substantially easier to extend the {marginaleffects} implementation to other model classes (e.g., stanreg). It would also allow packages such as tidybayes to build on it to streamline post-estimation ensemble predictions; I give a brief example of the latter in the Twitter thread.

ajnafa, Aug 15 '22 18:08

> The first case is easier, as some function in posterior could be given a set of draws objects and corresponding weights. These could be merged into a draws object with weights, which could then be transformed to an unweighted draws object with resample_draws(). To save memory, it would be possible to use resample_draws() to resample just the index and then copy only the selected draws from the given set of draws objects. This first case would also work for models written directly in Stan with generated quantities generating yrep.

Makes sense --- maybe mix_draws(list_of_draws, weights)?

> Currently posterior doesn't know anything about models or about functions that map from the parameter posterior to predictions. But if I understood correctly, @mjskay, you suggest that #39 could define a common API for prediction-related functions, which could then be extended to ensembles. If the function is given the models, then it would also be possible to first use resample_draws() to resample just the index and then ask each model to predict only the necessary number of draws.

Right. For the interface, I'm thinking of something like m = mix_models(list_of_models, weights) (or ensemble(...) or ...?) returning m with class "model_mixture", and then we provide implementations of posterior_predict.model_mixture() and so on. One complication is that, for some functions, different arguments may need to be sent to different models (we might need a way to specify this in mix_models()), but I think that's solvable.
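To make the shape of that interface concrete, here is a minimal sketch. It is only an illustration under stated assumptions: `mix_models_sketch()` and `toy_model()` are hypothetical, the `posterior_predict()` generic is a local stand-in so the code runs without rstantools, and the per-model `ndraws` argument is assumed rather than taken from any existing method.

```r
# Stand-in generic so the sketch is self-contained; in practice the generic
# would come from wherever the functions discussed in #39 end up living.
posterior_predict <- function(object, ...) UseMethod("posterior_predict")

# Toy model class (hypothetical): predictions are normal around `mean`.
toy_model <- function(mean) structure(list(mean = mean), class = "toy_model")
posterior_predict.toy_model <- function(object, nobs = 3, ndraws = 100, ...) {
  matrix(rnorm(ndraws * nobs, mean = object$mean), nrow = ndraws, ncol = nobs)
}

# Hypothetical constructor: stores the models and their normalized weights.
mix_models_sketch <- function(models, weights) {
  stopifnot(length(models) == length(weights),
            all(weights >= 0), sum(weights) > 0)
  structure(list(models = models, weights = weights / sum(weights)),
            class = "model_mixture")
}

# Method for the mixture: resample the model index first, then ask each
# model only for the number of draws assigned to it.
posterior_predict.model_mixture <- function(object, ndraws = 100, ...) {
  model_idx <- sample.int(length(object$models), size = ndraws,
                          replace = TRUE, prob = object$weights)
  out <- NULL
  for (k in unique(model_idx)) {
    rows <- which(model_idx == k)
    pred <- posterior_predict(object$models[[k]], ndraws = length(rows), ...)
    if (is.null(out)) out <- matrix(NA_real_, nrow = ndraws, ncol = ncol(pred))
    out[rows, ] <- pred
  }
  out
}

# Usage: downstream code calls the generic and never needs to know
# that `m` is an ensemble.
set.seed(2)
m <- mix_models_sketch(list(toy_model(0), toy_model(100)), weights = c(3, 1))
yrep <- posterior_predict(m, ndraws = 200, nobs = 4)
```

In this sketch the same `...` is forwarded to every model, so it does not address the complication of sending different arguments to different models.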

mjskay, Aug 16 '22 04:08

> maybe mix_draws(list_of_draws, weights)?

mix_draws(list_of_draws, weights = NULL, method = "stratified", ndraws = NULL, ...)? (compare to resample_draws())
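For reference, a small base-R sketch of what stratified resampling of the model index could look like. `stratified_index_sketch()` is a hypothetical name and this is a generic textbook version of stratified resampling, not necessarily how resample_draws() implements its "stratified" method.

```r
# Hypothetical helper: draw `ndraws` component indices by stratified
# resampling, so each component's count stays close to ndraws * weight.
stratified_index_sketch <- function(weights, ndraws) {
  stopifnot(all(weights >= 0), sum(weights) > 0, ndraws >= 1)
  cum_w <- cumsum(weights / sum(weights))
  # One uniform point inside each of the `ndraws` equal-width strata of (0, 1).
  u <- (seq_len(ndraws) - 1 + runif(ndraws)) / ndraws
  # Map each point to the first component whose cumulative weight exceeds it.
  idx <- findInterval(u, cum_w) + 1L
  # Guard against floating-point rounding in the last cumulative weight.
  pmin(idx, length(weights))
}

# Usage with made-up weights.
set.seed(3)
w <- c(0.5, 0.3, 0.2)
idx <- stratified_index_sketch(w, ndraws = 1000)
counts <- tabulate(idx, nbins = length(w))
```

Compared with plain multinomial sampling of the index, the per-model counts vary much less from run to run, which is the usual reason to prefer a stratified scheme.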

This could also be useful for stacking chains: https://jmlr.org/papers/v23/20-1426.html

avehtari, Aug 16 '22 07:08