ArviZ.jl
Implement stats and diagnostics in new lightweight Julia package
ArviZ's goal is to "...provide backend-agnostic tools for diagnostics and visualizations of Bayesian inference...". I would add that we want everyone to critique their models with the best available algorithms. Not every PPL user is going to use ArviZ.jl, but we want them to all use our algorithms.
A very Julian way to handle this problem is to reimplement some subset of the stats and diagnostics functionality of the Python arviz package in a lightweight, standalone Julia package, which ArviZ.jl can then depend on. Packages like MCMCChains or Soss could then also depend on this package and provide custom overloads for their own sample storage types, so that their users are directly encouraged to use these algorithms.
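To make the overload pattern concrete, here is a minimal sketch. Everything in it is hypothetical: `LightDiagnostics`, `mcse_naive`, `MyPPL`, and `MyChain` are made-up names, not part of ArviZ.jl, MCMCChains, or Soss. The statistic itself is a naive Monte Carlo standard error that ignores autocorrelation, chosen only because it is short. It is not the estimator arviz uses.

```julia
# Hypothetical lightweight package: the generic function is defined on plain arrays only.
module LightDiagnostics

using Statistics: var  # Statistics is a standard library, so the package stays light

# Naive Monte Carlo standard error of the mean.
# Assumption: draws are treated as independent (no autocorrelation correction).
mcse_naive(draws::AbstractVector{<:Real}) = sqrt(var(draws) / length(draws))

end # module LightDiagnostics

# Hypothetical PPL package with its own sample storage type.
module MyPPL

using ..LightDiagnostics

struct MyChain
    draws::Vector{Float64}
end

# Custom overload: unwrap the storage type and forward to the array method.
LightDiagnostics.mcse_naive(chain::MyChain) = LightDiagnostics.mcse_naive(chain.draws)

end # module MyPPL

# Usage: the same function works on a raw vector and on the PPL's own type.
chain = MyPPL.MyChain([1.0, 2.0, 3.0, 4.0])
se_from_chain = LightDiagnostics.mcse_naive(chain)
se_from_array = LightDiagnostics.mcse_naive([1.0, 2.0, 3.0, 4.0])
```

The point of the design is that the downstream package only adds a method. The algorithm lives in one place, so an improvement there reaches every PPL that overloads it.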
While such a package is within the scope of other PPLs (see https://github.com/TuringLang/MCMCChains.jl/issues/266), arviz has a robust developer community devoted explicitly to implementing/maintaining the best available algorithms for these tasks (as of this writing, arviz has had 80 contributors, the same number as Turing and Soss combined). So there is a good argument for the project being hosted by the arviz-devs org and used by Julia's PPLs.
This is great! I've had some discussions recently with the Turing team about refactoring things so there's less code overlap between AbstractMCMC/MCMCChains and SampleChains. MCMCChains has some nice diagnostics, but it would be much better to have this abstracted into something more general that's usable for other data structures.
The diagnostics in MCMCChains already work with AbstractArray, so they do not require the MCMCChains.Chains data structure and should already be generally usable. However, I guess it would be nice to avoid some of the dependencies that are caused only by the Chains part of the package. I don't think such a diagnostics package will be completely lightweight, since the diagnostics themselves require quite a few dependencies, but it will definitely be nice to avoid the AxisArrays dependency.
Ref: https://github.com/TuringLang/MCMCChains.jl/issues/266
I figured it's best to just try and see how lightweight or heavy the diagnostics part of MCMCChains is: https://github.com/TuringLang/MCMCChains.jl/issues/266#issuecomment-843001213
All diagnostics are now in MCMCDiagnosticTools.jl. All stats except kde are now here in the ArviZStats module (or re-exported by it in the case of PSIS), which will be split out into its own package.