cmdstanpy

dynamic traceplot for jupyter notebooks

Open mitzimorris opened this issue 5 years ago • 11 comments

Summary:

It would be super cool to provide a dynamic traceplot of the value of lp__ (total joint log probability for the model) across all chains during warmup and sampling - is this possible?

Description:

I showed Andrew Gelman Colab Jupyter notebooks (R version) and he asked for this dynamic visualization. We could pull this from the output files, similar to the way that the tqdm progress bar monitors standard out. Given the current Stan CSV format, the first column is lp__, the total joint log probability. Generally, when lp__ converges, all params have converged; conversely, if any params have not converged, lp__ will not converge either, so this would be good enough.
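A minimal sketch of the file-reading idea, assuming the sampler's Stan CSV output is read as text while the chain is still running. The function name `read_lp` and the sample text are hypothetical. Comment lines start with `#`, and anything after the last newline is treated as a half-written row and ignored.

```python
import csv
import io


def read_lp(csv_text):
    """Return the lp__ values found so far in (possibly partial) Stan CSV text."""
    # Keep complete lines only: whatever follows the last newline may be
    # a row the sampler has not finished writing yet.
    lines = [ln.rstrip("\r") for ln in csv_text.split("\n")[:-1]]
    # Stan CSV comment lines start with '#'.
    rows = [ln for ln in lines if ln and not ln.startswith("#")]
    if not rows:
        return []
    reader = csv.reader(io.StringIO("\n".join(rows)))
    header = next(reader)
    col = header.index("lp__")  # first column in current Stan CSV output
    return [float(row[col]) for row in reader]


# Hypothetical snapshot of a chain's output file, cut off mid-row.
sample = (
    "# model = bernoulli_model\n"
    "lp__,accept_stat__,stepsize__,treedepth__,n_leapfrog__,divergent__,energy__,theta\n"
    "# Adaptation terminated\n"
    "-7.31,0.98,0.9,1,3,0,7.5,0.31\n"
    "-6.75,1.0,0.9,2,3,0,6.9,0.25\n"
    "-6.9"
)
lp_values = read_lp(sample)
```

Calling `read_lp` on each chain's file at a fixed interval would give the data for a live plot; how often to poll is a design choice this sketch leaves open.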

we want to discourage folks from trying to monitor a particular param - there are other more appropriate tools that should be used - so simple feature - just monitor lp__ across chains!

Additional Information:

Current Version:

mitzimorris avatar Jan 13 '20 23:01 mitzimorris

@ahartikainen and @evgenyneu - thoughts? is this feasible?

mitzimorris avatar Jan 13 '20 23:01 mitzimorris

Like bokeh + streaming?

https://atomar94.github.io/real-time-streaming-plots-with-python-and-bokeh/

This will be possible with ArviZ at some point when I have time to implement it.

ahartikainen avatar Jan 14 '20 00:01 ahartikainen

A proof of concept :)

https://gitlab.com/evgenyneu/2019_logbook/tree/master/a2020/a01/a14_cmdstanpy_probability_plot/code

From Terminal

https://youtu.be/qGrPKlw3ytA

From Jupyter

[image: live_probability_plot_jupyter]

evgenyneu avatar Jan 15 '20 05:01 evgenyneu

super cool! but I think what Andrew was asking for was to superimpose all chains so that you get the classic "fuzzy caterpillar" when things converge. Here's an example where someone showed traceplots of 3 params, but we just want "lp__", the total joint log probability of the model.

[image: example traceplots of three parameters]
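A rough sketch of superimposing chains on one set of axes, assuming the lp__ values per chain are already available as plain lists. The names `overlay_series` and `draw_overlay` are hypothetical, and `draw_overlay` expects a Matplotlib `Axes` object to be passed in.

```python
def overlay_series(lp_by_chain):
    """Turn {chain_id: [lp__, ...]} into (label, iterations, values) triples."""
    series = []
    for chain_id in sorted(lp_by_chain):
        values = list(lp_by_chain[chain_id])
        iterations = list(range(1, len(values) + 1))
        series.append((f"chain {chain_id}", iterations, values))
    return series


def draw_overlay(ax, series):
    """Redraw every chain on one shared Matplotlib axes."""
    ax.clear()
    for label, iterations, values in series:
        ax.plot(iterations, values, linewidth=0.8, label=label)
    ax.set_xlabel("iteration")
    ax.set_ylabel("lp__")
    ax.legend(loc="lower right")


# Chains write output at their own pace, so lengths may differ mid-run.
lp_by_chain = {1: [-9.2, -7.4, -7.1], 2: [-12.0, -7.9], 3: [-8.8, -7.3, -6.9]}
series = overlay_series(lp_by_chain)
```

In a notebook, one possible loop is to create `fig, ax = plt.subplots()` once, then call `draw_overlay(ax, series)` followed by `IPython.display.clear_output(wait=True)` and `display(fig)` each time new draws arrive.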

mitzimorris avatar Jan 15 '20 14:01 mitzimorris

This is great! Even better that it runs within a Jupyter notebook.

The one stat we should trace is lp__, the log density (up to an additive constant). The reason is that HMC tends to mix more slowly in the log density than in the parameters themselves, so when lp__ has mixed, the parameters have typically mixed.

What @andrewgelman is after is a bivariate traceplot of two variables. The draws are (x(n,t), y(n,t)) for draw t on chain n, and lines would connect (x(n,t), y(n,t)) to (x(n,t+1), y(n,t+1)). And that would also be done with multiple chains, each rendered in a different color.
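A small sketch of the segment construction described above, assuming each chain's draws of the two variables are held as two equal-length lists. The helper name `trace_segments` and the numbers are made up for illustration.

```python
def trace_segments(x, y):
    """Segments joining draw t to draw t+1 for one chain."""
    points = list(zip(x, y))
    # Pair each point with its successor: (p0, p1), (p1, p2), ...
    return list(zip(points[:-1], points[1:]))


# Hypothetical draws: chain id -> (x draws, y draws).
chains = {
    1: ([0.1, 0.4, 0.3], [1.0, 1.2, 0.9]),
    2: ([0.8, 0.5], [0.2, 0.6]),
}
segments = {n: trace_segments(x, y) for n, (x, y) in chains.items()}
```

Each chain's segment list could then be drawn in its own color, for example by handing it to Matplotlib's `LineCollection`.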

bob-carpenter avatar Jan 15 '20 20:01 bob-carpenter

I think that is doable with ArviZ. I just need to add some extra options and then a wrapper for the sampler (e.g. PyStan3 could do this without reading any files).

https://arviz-devs.github.io/arviz/examples/bokeh/bokeh_plot_pair.html

ahartikainen avatar Jan 15 '20 20:01 ahartikainen

Here is a fuzzy caterpillar concept :)

https://gitlab.com/evgenyneu/2019_logbook/tree/master/a2020/a01/a15_fuzzy_caterpillar/code

Terminal: https://youtu.be/f6BpCAHXCvY

Jupyter: https://youtu.be/Ku9JsJAClgI

evgenyneu avatar Jan 16 '20 11:01 evgenyneu

That's really great. Thanks for sharing.

bob-carpenter avatar Jan 16 '20 17:01 bob-carpenter

> we want to discourage folks from trying to monitor a particular param - there are other more appropriate tools that should be used

why is it just lp? I frequently find a trace of tree depth useful, and have a window with watch "cat headers; cut -f1-8 -d, < model.csv | tr , \\\t | column -t | tail -n20" running just to see what's happening. What are these other tools that might do something similar?
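For readers who prefer Python to the shell pipeline, here is a rough equivalent as a sketch. It assumes the header row sits in the same CSV text as the draws, rather than in a separate `headers` file as in the command above. The function name `tail_columns` is hypothetical. With the default NUTS sampler, the leading columns include `treedepth__`.

```python
def tail_columns(csv_text, n_cols=8, n_rows=20):
    """Format the first n_cols columns of the last n_rows draws as a table."""
    # Complete, non-comment lines only; the text after the last newline
    # may be a half-written row.
    lines = [
        ln for ln in csv_text.split("\n")[:-1]
        if ln and not ln.startswith("#")
    ]
    if not lines:
        return ""
    header, draws = lines[0], lines[1:]
    rows = [ln.split(",")[:n_cols] for ln in [header] + draws[-n_rows:]]
    # Column width = widest cell in that column.
    widths = [max(len(cell) for cell in col) for col in zip(*rows)]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in rows
    )


# Hypothetical, shortened output file contents.
sample = (
    "# comment\n"
    "lp__,accept_stat__,treedepth__,theta\n"
    "-7.3,0.98,1,0.31\n"
    "-6.8,1.0,2,0.25\n"
)
table = tail_columns(sample, n_cols=3, n_rows=1)
print(table)
```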

maedoc avatar Dec 05 '21 17:12 maedoc

Hello, I do realize that this is an old thread, which I stumbled upon by accident. When I asked a similar question in the Stan forums, user mitzimorris, who is also in this thread, pointed me to this: https://github.com/flatironinstitute/mcmc-monitor. I have no experience with it, but given that it runs on web technologies, it should be feasible to adapt to a Jupyter notebook, right?

jpmvferreira avatar Aug 10 '23 11:08 jpmvferreira

@maedoc: MCMC monitor lets you monitor pretty much everything. It's like an online version of ShinyStan (an R interface for analyzing Stan posteriors).

@jpmvferreira: MCMC monitor is a standalone web server that uses a web browser to provide its own interface. I do not think it will be easy to use with Jupyter.

bob-carpenter avatar Aug 10 '23 16:08 bob-carpenter