[FR] get mcmc.summary() as DataFrame

Open noamsgl opened this issue 3 years ago • 6 comments

First of all thanks for the great package!

Issue Description

One of the useful features of the MCMC class is the MCMC.summary() method, which prints summary statistics of the sampled chain.

In this issue I'm requesting a way to get this table as a pandas DataFrame.

Code Snippet

from pyro.infer import MCMC, NUTS

def model(data):
    ...

nuts_kernel = NUTS(model)
mcmc = MCMC(nuts_kernel, num_samples=500)
mcmc.run(data)
mcmc.summary()

It would be very helpful if there were a simple way to get the output of mcmc.summary() as a DataFrame, or alternatively as a dict that can be turned into a DataFrame.

noamsgl avatar Jan 26 '22 14:01 noamsgl

@noamsgl we don't have pandas as a dependency and i guess this isn't a sufficiently good reason to add it as one. i believe you can get the dictionary you want from this function. not entirely sure why it doesn't seem to be exposed in the docs...

martinjankowiak avatar Jan 26 '22 16:01 martinjankowiak

It should be fine to treat this as an optional dependency

def summary_dataframe(self):
    import pandas as pd
    ...

fritzo avatar Jan 26 '22 16:01 fritzo

@fritzo I'm a newbie to open source contributing but that idea sounds like it could work. Is that something you would advise me to try as a first contribution?

noamsgl avatar Jan 27 '22 00:01 noamsgl

Yes @noamsgl this would be a great first contribution! You'd need to

  • implement the function
  • add a docstring
  • add a smoke test (this should be easy since pandas is already a test dependency)

fritzo avatar Jan 27 '22 00:01 fritzo

Thanks @fritzo @martinjankowiak for your replies. I'm working on the implementation.

I want to use the function pyro.infer.mcmc.util.summary(), but I think I found a bug in it.

Steps to reproduce: https://colab.research.google.com/drive/1TnDfjuFPnl_iecgapbq0Oi6lrfP8xf08?usp=sharing

Is this a bug or just misuse?

noamsgl avatar Feb 02 '22 08:02 noamsgl

i believe you need group_by_chain=False; please refer to the docstring for details

martinjankowiak avatar Feb 02 '22 16:02 martinjankowiak
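
For later readers, here is a minimal sketch of the workaround discussed in this thread. It is not the eventual Pyro implementation. It assumes that pyro.infer.mcmc.util.summary() returns a nested dict of the form {site name: {statistic name: value}} and that every site is scalar. The names summary_rows and the standalone summary_dataframe are hypothetical helpers. pandas is imported lazily, as suggested above, so it stays an optional dependency.

```python
def summary_rows(stats):
    """Flatten {site: {stat: value}} into one row dict per site.

    Assumption: every value is a scalar (a 0-dim tensor, a NumPy scalar,
    or a plain number), so float() can convert it.
    """
    rows = []
    for site, site_stats in stats.items():
        row = {"site": site}
        for stat_name, value in site_stats.items():
            row[stat_name] = float(value)
        rows.append(row)
    return rows


def summary_dataframe(stats):
    """Build a DataFrame indexed by site name (hypothetical helper)."""
    import pandas as pd  # imported here so pandas remains optional

    return pd.DataFrame(summary_rows(stats)).set_index("site")


# Possible usage with a fitted `mcmc` object (requires pyro, not run here):
#
#   from pyro.infer.mcmc.util import summary
#   samples = mcmc.get_samples()  # no leading chain dimension by default
#   stats = summary(samples, group_by_chain=False)
#   df = summary_dataframe(stats)
```

The group_by_chain=False flag matters here because get_samples() returns samples without a chain dimension unless asked otherwise, so the two calls need to agree on the layout. Sites with non-scalar shapes would need their statistics expanded into one row per element, which this sketch does not attempt.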