ArviZ.jl

Implement stats and diagnostics in new lightweight Julia package

Open sethaxen opened this issue 3 years ago • 3 comments

ArviZ's goal is to "...provide backend-agnostic tools for diagnostics and visualizations of Bayesian inference...". I would add that we want everyone to critique their models with the best available algorithms. Not every PPL user will use ArviZ.jl, but we want all of them to use our algorithms.

A very Julian way to handle this problem is to reimplement a subset of the stats and diagnostics functionality from the Python arviz package in a lightweight, standalone Julia package, which ArviZ.jl can then depend on. Packages like MCMCChains or Soss could also depend on this package and provide custom overloads for their sample storage types, so that their users are directly encouraged to use these algorithms.
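A minimal sketch of that pattern, assuming a draws × chains matrix layout. The names `TinyDiagnostics`, `MyChains`, and `Samples` are hypothetical, and `rhat` here is the classic (non-split) Gelman-Rubin statistic, used only as a stand-in; it is not the rank-normalized version arviz implements.

```julia
# Hypothetical lightweight diagnostics package; names are illustrative only.
module TinyDiagnostics

using Statistics: mean, var

# Classic (non-split) Gelman-Rubin R-hat for a draws × chains matrix.
# Accepts any AbstractMatrix, so no particular chain container is required.
function rhat(x::AbstractMatrix{<:Real})
    n = size(x, 1)                          # number of draws per chain
    chain_means = vec(mean(x; dims=1))      # one mean per chain
    within = mean(vec(var(x; dims=1)))      # mean within-chain variance
    between = n * var(chain_means)          # between-chain variance
    var_plus = (n - 1) / n * within + between / n
    return sqrt(var_plus / within)
end

end # module

# Hypothetical downstream package with its own sample storage type.
module MyChains

import ..TinyDiagnostics

struct Samples
    draws::Matrix{Float64}  # draws × chains
end

# The overload: forward the custom type to the generic array method.
TinyDiagnostics.rhat(s::Samples) = TinyDiagnostics.rhat(s.draws)

end # module

x = randn(1000, 4)
s = MyChains.Samples(x)
TinyDiagnostics.rhat(s) == TinyDiagnostics.rhat(x)  # same method, reached via the overload
```

The downstream package only needs to depend on the small package and add a one-line method; users of `Samples` then get the shared implementation without the storage type leaking into the diagnostics code.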

While such a package is within the scope of other PPLs (see https://github.com/TuringLang/MCMCChains.jl/issues/266), arviz has a robust developer community devoted explicitly to implementing/maintaining the best available algorithms for these tasks (as of this writing, arviz has had 80 contributors, the same number as Turing and Soss combined). So there is a good argument for the project being hosted by the arviz-devs org and used by Julia's PPLs.

sethaxen, May 17 '21 21:05

This is great! I've had some discussions recently with the Turing team about refactoring things so there's less code overlap between AbstractMCMC/MCMCChains and SampleChains. MCMCChains has some nice diagnostics, but it would be much better to have this abstracted into something more general that's usable for other data structures.

cscherrer, May 17 '21 21:05

The diagnostics in MCMCChains already work with AbstractArray, so they do not require the MCMCChains.Chains data structure and should already be generally usable. However, I guess it would be nice to avoid some of the dependencies that are caused only by the Chains part of the package. I don't think such a diagnostics package will be completely lightweight, since the diagnostics themselves also require quite a few dependencies, but it will definitely be nice to avoid the AxisArrays dependency.

Ref: https://github.com/TuringLang/MCMCChains.jl/issues/266

devmotion, May 17 '21 23:05

I figured it's best to just try and see how lightweight or heavy the diagnostics part of MCMCChains is: https://github.com/TuringLang/MCMCChains.jl/issues/266#issuecomment-843001213

devmotion, May 18 '21 09:05

All diagnostics are now in MCMCDiagnosticTools.jl. All stats except kde are now here in the ArviZStats module (or re-exported by it in the case of PSIS), which will be split out into its own package.

sethaxen, Aug 02 '23 15:08