
Rethinking_2: trace_to_dataframe() deprecation, move to InferenceData

Open joannadiong opened this issue 3 years ago • 3 comments

The Rethinking_2 notebooks (e.g. Chp_04) currently use PyMC3's trace_to_dataframe() to save traces. That function is planned for deprecation, and ArviZ is the intended package for saving traces. As per this comment by @AlexAndorra, ArviZ's InferenceData format is a superior replacement because it handles multidimensional data natively and labels dimensions by name rather than by axis number alone. Eventually, all instances of trace_to_dataframe() in the notebooks will need to be updated.

I attempted to update the following code section in Chp_04. I think the code below is part of the solution, but I can't work out how to get the covariance estimates needed for this section. I'm happy for a Developer to be assigned to fix this. Or I could be assigned and attempt to fix it if Alex or a fellow Dev could provide guidance, since I am new to Bayes and PyMC3. Happy to discuss, and thanks for the support!

# Code 4.32 
with pm.Model() as model: 
    pm_data = az.from_pymc3(trace=trace_4_1)

joannadiong avatar Jan 24 '21 06:01 joannadiong

Thanks @joannadiong ! Yeah, what you're laying out sounds like a good plan. If you want, you could open a WIP PR implementing the changes you have in mind that already work, and flag the parts where you're more hesitant or can't work out how to proceed. Sounds good?

AlexAndorra avatar Jan 24 '21 10:01 AlexAndorra

Hi @AlexAndorra , thanks for the speedy reply. Sounds like a good approach. Will keep this on my TODO list and hope to be in touch when ready. Cheers!

joannadiong avatar Jan 25 '21 09:01 joannadiong

I tried this.

idata = az.from_pymc3(trace_4_1, model=m4_2)
trace_df = idata.posterior.to_dataframe()
trace_df.cov()

But it ends with an error:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
/var/folders/ty/2zf3422s5bd91_vnjm_3wlqh0000gn/T/ipykernel_1341/434011363.py in <module>
----> 1 idata = az.from_pymc3(trace_4_1, model=m4_2)
      2 trace_df = idata.posterior.to_dataframe()
      3 trace_df.cov()
      4 # trace_df = pm.trace_to_dataframe(trace_4_1)
      5 # trace_df.cov()

~/opt/miniconda3/envs/pymc3/lib/python3.9/site-packages/arviz/data/io_pymc3_3x.py in from_pymc3(trace, prior, posterior_predictive, log_likelihood, coords, dims, model, save_warmup, density_dist_obs)
    578     InferenceData
    579     """
--> 580     return PyMC3Converter(
    581         trace=trace,
    582         prior=prior,

~/opt/miniconda3/envs/pymc3/lib/python3.9/site-packages/arviz/data/io_pymc3_3x.py in __init__(self, trace, prior, posterior_predictive, log_likelihood, predictions, coords, dims, model, save_warmup, density_dist_obs)
     73         density_dist_obs: bool = True,
     74     ):
---> 75         import pymc3
     76 
     77         try:

ModuleNotFoundError: No module named 'pymc3'

Both PyMC and ArviZ are at their latest versions: PyMC v4.0.0b2 and ArviZ v0.11.4. I have to say that it is a really bad idea to change the name of a library :(
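
The traceback shows az.from_pymc3 running `import pymc3` internally, while a PyMC v4 install provides the package under the name `pymc`. A quick way to check which of the two names is importable in the active environment might look like the sketch below. The helper name `first_installed` is hypothetical and not part of PyMC or ArviZ.

```python
import importlib.util


def first_installed(candidates):
    """Return the first importable top-level package name, or None.

    `first_installed` is a hypothetical helper written for this check;
    it only looks packages up and does not import them.
    """
    for name in candidates:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


# "pymc" is the package name used by PyMC v4, "pymc3" the older name
# that the traceback shows az.from_pymc3 trying to import.
print(first_installed(["pymc", "pymc3"]))
```

If this prints `pymc`, the ModuleNotFoundError above is expected, since the converter in the traceback looks for `pymc3` specifically.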

DayuanJiang avatar Feb 28 '22 15:02 DayuanJiang
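
For reference, here is a minimal sketch of the covariance step discussed in this thread, assuming the posterior holds only scalar parameters. The `posterior` group of an InferenceData object is an xarray Dataset with `chain` and `draw` dimensions, so the example builds a synthetic Dataset as a stand-in for `idata.posterior`. The parameter names `mu` and `sigma` and all values are made up for illustration and are not taken from the notebook.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for `idata.posterior`: 2 chains x 500 draws of two
# scalar parameters. Names and values are illustrative assumptions only.
rng = np.random.default_rng(0)
posterior = xr.Dataset(
    {
        "mu": (("chain", "draw"), rng.normal(154.6, 0.4, size=(2, 500))),
        "sigma": (("chain", "draw"), rng.normal(7.7, 0.3, size=(2, 500))),
    },
    coords={"chain": [0, 1], "draw": np.arange(500)},
)

# to_dataframe() gives one row per (chain, draw) pair and one column
# per variable, which is the shape DataFrame.cov() expects.
trace_df = posterior.to_dataframe()

# Covariance matrix of the parameters, pooled over all chains and draws.
cov = trace_df.cov()
print(cov)
```

With a real InferenceData object, `posterior` would be replaced by `idata.posterior`. Variables with extra dimensions would add index levels to the DataFrame, so this shortcut is best suited to scalar parameters.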