posteriordb
Add structure for likelihood to compute predictive distributions based on draws
In model slot.
Would this be an additional slot in the model info file like blr.info.json or in a model code file like blr.stan?
I'm actually starting to think of this as a separate Stan function for now (a separate Stan file). Then we can add it along the way.
So you're thinking that blr.info.json would be something like this?
{
  "name": "blr",
  "keywords": [],
  "title": "A Bayesian linear regression model with vague priors",
  "description": "A Bayesian linear regression model with vague priors.",
  "urls": [],
  "model_code": {
    "stan": "models/stan/blr.stan"
  },
  "likelihood_code": {
    "stan": "models/stan/blr_likelihood.stan"
  },
  "references": null,
  "added_date": "2019-11-29",
  "added_by": "Mans Magnusson"
}
Yes. Exactly!
With PyMC there is no need for any extra code to compute predictive distributions:
posterior_predictive_samples = pymc.sample_posterior_predictive(draws, model=model)
where model is any PyMC model and draws are the posterior draws computed with PyMC. Source: https://docs.pymc.io/notebooks/posterior_predictive.html
It seems backwards to include likelihood_code in the model info file when not all model implementations will require it.
Ideal solution
From my perspective the ideal solution seems like this
- change blr.info.json to this
{
  "name": "blr",
  "model_implementations": {
    "stan": "models/stan/blr.info.json",
    "pymc": "models/pymc/blr.info.json"
  }
}
(some slots like keywords have been omitted for the sake of clarity, but they still would remain here)
models/stan/blr.info.json would be
{
  "code": "models/stan/blr.stan",
  "likelihood": "models/stan/blr_likelihood.stan"
}
models/pymc/blr.info.json would be
{
  "code": "models/pymc/blr.py"
}
I agree that this is probably the right way to go. Well spotted.
How should we expose the likelihood/predictive distribution to users?
In other words, if I have a posterior object po <- posterior("8_schools"), what would the API be like to access the likelihood/predictive distribution?
Would it be something like stan_predictive_draws(po, posterior_draws)? Would we also want something like stan_likelihood(po, posterior_draws)? What would this return?
These could be generalized to predictive_draws(po, posterior_draws, framework = "stan") and the same for stan_likelihood.
Yes, that would probably be a good idea
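To make the generalized form concrete, here is a minimal Python sketch of dispatching on a framework argument. Everything in it is hypothetical: the registry, the function names, and the placeholder Stan backend are illustrations only and not part of posteriordb, and a real backend would run the likelihood code instead of echoing the draws.

```python
from typing import Callable, Dict, List

# Hypothetical registry: framework name -> function that turns posterior
# draws into predictive draws. None of these names exist in posteriordb.
_BACKENDS: Dict[str, Callable[[str, List[dict]], List[dict]]] = {}


def register_backend(framework):
    """Decorator that registers a predictive-draws backend for a framework."""
    def decorator(fn):
        _BACKENDS[framework] = fn
        return fn
    return decorator


@register_backend("stan")
def _stan_predictive_draws(posterior_name, posterior_draws):
    # Placeholder: a real backend would run the Stan likelihood file here.
    return [{"posterior": posterior_name, "source_draw": d} for d in posterior_draws]


def predictive_draws(posterior_name, posterior_draws, framework="stan"):
    """Single entry point; the framework argument selects the backend."""
    try:
        backend = _BACKENDS[framework]
    except KeyError:
        raise ValueError(f"no predictive-draws backend for framework {framework!r}") from None
    return backend(posterior_name, posterior_draws)


result = predictive_draws("8_schools", [{"mu": 4.2}], framework="stan")
```

With this shape, adding another framework means registering one more backend, and callers keep using the same function.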
Write a suggestion for the 8-schools example of how it should look and review it with Paul.
The structure change suggested by @eerolinna needs to be done before @jarnefeltoliver starts to fill out the db.
I have now implemented it, but I think it is simpler to just add a JSON object directly:
{
  "name": "blr",
  "model_implementations": {
    "stan": {
      "model_code": "models/stan/blr.stan",
      "likelihood_code": "models/stan/blr_likelihood.stan"
    },
    "pymc": {
      "model_code": "models/pymc/blr.py"
    }
  }
}
That's a good idea!
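For illustration, a small Python sketch of how a client could read the nested layout, treating likelihood_code as optional per framework. The helper names are made up for this example, and the JSON is inlined (with slots like keywords omitted) so the snippet runs on its own.

```python
import json

# The nested info object from above, inlined so the sketch is self-contained.
INFO = json.loads("""
{
  "name": "blr",
  "model_implementations": {
    "stan": {
      "model_code": "models/stan/blr.stan",
      "likelihood_code": "models/stan/blr_likelihood.stan"
    },
    "pymc": {
      "model_code": "models/pymc/blr.py"
    }
  }
}
""")


def implementation(info, framework):
    """Return the implementation entry for a framework (hypothetical helper)."""
    impls = info["model_implementations"]
    if framework not in impls:
        raise KeyError(f"{info['name']} has no {framework!r} implementation")
    return impls[framework]


def likelihood_code(info, framework):
    """Return the likelihood file path, or None when the framework needs none."""
    return implementation(info, framework).get("likelihood_code")


stan_likelihood = likelihood_code(INFO, "stan")
pymc_likelihood = likelihood_code(INFO, "pymc")
```

Here a missing likelihood_code is not an error: it simply means the framework can compute predictive draws without a separate file.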
Do we want to have an API that exposes the likelihood code? Something like
stan_likelihood_code_file(po)
that would for the blr posterior return models/stan/blr_likelihood.stan.
Or is it sufficient that we have something like
posterior_predictive_draws(po, framework = "stan")
that essentially keeps the likelihood code file as an implementation detail instead of public API?
Arguments for keeping the likelihood code file out of the public API
- Smaller API surface is easier to learn, understand and maintain
- If Stan ever adds a way to automatically compute predictive draws without a separate likelihood definition we can remove the likelihood code files without breaking anyone's code
Arguments for adding the likelihood code to the public API
- Someone might need the likelihood code outside of computing predictive draws?
We would like to be able to produce the predictive distribution using the Stan likelihood file without any restrictions on how the posterior was computed. So in R we would have a function that uses the likelihood file but returns a predictive distribution. Although, as you say, this does not necessarily need to expose the likelihood code. I have no good answer yet.
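As a rough sketch of that idea, the Python below computes predictive draws from nothing but posterior draws and a likelihood, so it does not matter how the draws were produced. It assumes a normal likelihood for a linear regression like blr, written in plain Python as a stand-in for the Stan likelihood file; the parameter names beta and sigma and the function name are assumptions for this example.

```python
import random


def predictive_draws(posterior_draws, X, seed=0):
    """Draw one replicated data set y_rep per posterior draw.

    Assumes a normal likelihood y_i ~ Normal(x_i . beta, sigma); the
    parameter names "beta" and "sigma" are assumptions for this sketch.
    """
    rng = random.Random(seed)
    y_rep = []
    for draw in posterior_draws:
        beta, sigma = draw["beta"], draw["sigma"]
        # Linear predictor for every row of the design matrix.
        means = [sum(b * x for b, x in zip(beta, row)) for row in X]
        # One simulated observation per row, given this draw's parameters.
        y_rep.append([rng.gauss(mu, sigma) for mu in means])
    return y_rep


# Made-up design matrix and posterior draws, only to exercise the function.
X = [[1.0, 0.5], [1.0, 2.0], [1.0, -1.0]]
draws = [
    {"beta": [0.1, 2.0], "sigma": 1.0},
    {"beta": [0.3, 1.8], "sigma": 0.9},
]
y_rep = predictive_draws(draws, X)
```

The function only sees draws and data, which is the property wanted here: the draws could come from Stan, PyMC, or anything else.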