posteriordb Add Tensorflow probability and Pyro example for 8-schools

Test to give an additional PPF framework in the database.

Dec 12 '19 07:12 MansMeg

If time, also try add Pyro, JAGS and PyMC for 8-schools (but is low prio)

Dec 19 '19 10:12 MansMeg

Here's a pymc3 model

import pymc3 as pm3

def model(data):
    J = data["J"] # number of schools
    y_obs = data["y"] # estimated treatment
    sigma = data["sigma"] # std of estimated effect
    with pm3.Model() as pymc_model:

        mu = pm3.Normal('mu', mu=0, sd=5) # hyper-parameter of mean, non-informative prior
        tau = pm3.Cauchy('tau', alpha=0, beta=5) # hyper-parameter of sd 
        theta_trans = pm3.Normal('theta_trans', mu=0, sd=1, shape=J)
        theta = mu + tau * theta_trans
        y = pm3.Normal('y', mu=theta, sd=sigma, observed=y_obs)
    return pymc_model

To run it

import numpy as np
J = 8 
y = np.array([28,  8, -3,  7, -1,  1, 18, 12]) 
sigma = np.array([15, 10, 16, 11,  9, 11, 10, 18])                                
data = {"J": J, "y": y, "sigma": sigma}
m = model(data)
trace = pm3.sample(1000, model=m)

I tested that it works.

This is adapted from https://sharanry.github.io/post/eight-schools-model/ The model there had log_tau (maybe it gives better performance with pymc or perhaps it was just preference), I changed that to just tau.

I could make a PR with this.

Dec 19 '19 16:12 eerolinna

Great! Two thing.

We want this to be a stan alone py file for the model I guess?
We want to be able to run it by pointing to the posteriordb json. Ideally it should be possible to run from R as well. Im not sure how to include the model/pymc in the best way?

Dec 19 '19 17:12 MansMeg

My PR #118 (sorry forgot to reference this issue in it, did that now) is probably the best way to include the pymc model with the current design.

I'm not fully sure what you are asking in 2. Do you mean that will the users be able to call code <- model_code(po, framework = "pymc3") and then use the result to run the model? Or do you mean that posteriordb would have something like run_posterior(po, framework = "pymc3")?

Dec 19 '19 18:12 eerolinna

Oh I was not super clear. I meant to point to the json data. Ie how would an call using the posteriordb look like?

Dec 20 '19 06:12 MansMeg

Oh you mean the actual data? I'll first show how it would look like using python as R will have to go through python anyway.

import importlib
import pymc3
from posteriordb import PosteriorDatabase
my_pdb = PosteriorDatabase()
posterior = my_pdb.posterior("eight_schools")

python_code_path = posterior.model.code_file_path("pymc3")
python_lib = importlib.import_module(python_code_path)

data = posterior.data.values()

model = python_lib.model(data)
draws = pymc3.sample(model=model)

A potential way to use it from R could be

po <- posterior("eight_schools", pdb = pd)
dat <- get_data(po)
pymc_code_path <- model_code(po, "pymc3")
draws <- run_pymc(pymc_code_path, dat)

where run_pymc would essentially call a python command line script that takes model code path and data json, runs the inference and output the result as json (or something else).

Of course run_pymc(po) would be doable as well as we can get code path and data from the posterior.

I didn't comment any of the code, if you want clarifications feel free to ask!

Dec 20 '19 12:12 eerolinna

Yes. This looks exactly how I would like to have it!

Although, I think run_pymc3 should probably not be part of posteriordb.

Dec 20 '19 14:12 MansMeg

I agree that run_pymc3 doesn't have to be in posteriordb.

Dec 20 '19 15:12 eerolinna

I will start out by including 8-schools for all PPFs:

Pyro: https://forum.pyro.ai/t/hierarchical-models-and-eight-schools-example/362

JAGS: http://rstudio-pubs-static.s3.amazonaws.com/15236_9bc0cd0966924b139c5162d7d61a2436.html

Tensorflow Probability: https://github.com/tensorflow/probability/blob/master/tensorflow_probability/examples/jupyter_notebooks/Eight_Schools.ipynb

PyMC (see Eeros stuff above).

Jan 03 '20 13:01 MansMeg

Alright. When implementing JAGS Iusing run_jags() I realized that we need to automatically check that all models are equal with respect to the log density for a set of say 100 parameter values. This can be a little tricky between models implemented in R respectively Python. Especially since we want the tests to be run on travis every time.

Ideally we would like to have logp(po, params, "stan"), logp(po, params, "tfp") etc.

May 13 '20 05:05 MansMeg

I promised to clarify the issue for @mtwest2718

So we want to include two things with this issue:

Include Tensorflow code and Pyro code for the 8-schools model
Check that the log density for the models is identical (or proportional) between TFP, Pyro, PyMC3 and Stan for say 100 parameter draws from the reference posterior.

Maybe also include the test in 2. as a continuous integration test. I'm not sure exactly how to do that so that is probably a separate issue since it involves including PyMC3, Pyro, TFP etc in Travis, something that might become a maintenance hell.

Jul 05 '20 06:07 MansMeg

posteriordb posteriordb copied to clipboard

Add Tensorflow probability and Pyro example for 8-schools

posteriordb
posteriordb copied to clipboard