Proposal: General Python improvements, compatibility with cmdstanpy, and packaging
Howdy, really appreciate having these diagnostics available. I am a cmdstanpy user and found it pretty straightforward to translate over the `extract_*` functions to be compatible.
While going through the code, I thought there were a few areas that could be improved. Some initial thoughts:
- There are some minor errors that I assume slipped through the cracks when translating these diagnostics from R to Python, e.g. there are some undefined variables, a `NaN` that should probably be a `np.NaN`, etc. I didn't hit any branches that caused errors when I ran the diagnostics myself, but either way those fixes would be easy.
- Building general compatibility for both PyStan and cmdstanpy. Since you've already done most of the work with the `extract_*` functions, this could be as simple as implementing these functions for both the `stan.StanFit` objects and the `cmdstanpy.stanfit.mcmc.CmdStanMCMC` objects. This wouldn't be much work and would make it easy for any Python Stan user to take advantage of this diagnostic code.
- Packaging up the code base and making it PyPI installable. The intention here is just to make it so a Python Stan user could `pip install mcmc-diagnostics` (or whatever the name should be) and quickly incorporate these diagnostics into their workflow. I know this repo has a mix of R, PyStan2, and PyStan3 implementations as well as documentation, so I'm not sure whether you'd want to separate things out.
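
To make the second suggestion a bit more concrete, here is a rough sketch of what a shared extraction helper might look like. The name `extract_variable` is hypothetical and not something in this repo. It assumes cmdstanpy fits expose `stan_variable(name)` and that PyStan 3 fits allow mapping-style access via `fit[name]`. It uses duck typing so neither library has to be imported, and it does not try to reconcile the different array layouts the two interfaces return.

```python
def extract_variable(fit, name):
    """Return the draws for `name` from a cmdstanpy-like or PyStan-like fit.

    Hypothetical helper for illustration only; the array layout is left
    exactly as each interface returns it.
    """
    # Assumption: cmdstanpy's CmdStanMCMC provides stan_variable(name).
    if hasattr(fit, "stan_variable"):
        return fit.stan_variable(name)
    # Assumption: PyStan 3 fit objects support fit[name] lookups.
    if hasattr(fit, "__getitem__"):
        return fit[name]
    raise TypeError(f"Unsupported fit object: {type(fit).__name__}")
```

The existing `extract_*` functions could then call a helper like this instead of touching the fit object directly, which would keep the interface-specific logic in one place.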
With all of these suggestions, I would be happy to work on implementing them if you are at all interested in doing so. Personally, I think it would be great to have a diagnostic utility like this as something that can be easily incorporated into one's workflow.
If you're interested in pull requests or have any particular desires for future direction here, I'd be interested in hearing about them.
- Comments/pull requests addressing any stray code, outright bugs, or even recommendations for Pythonic convention consistency (for example in some of the error checking but in general anywhere) are all welcome and appreciated!
- I will not be supporting CmdStanPy directly. The main issue is maintaining a small enough scope that I can manage while also making it clear how to generalize to other interfaces. The code is intentionally generously licensed so that people can wrap it into self-contained implementations for any other interfaces.
- Right now I very much consider this alpha-level code with parts of the API potentially unstable and hence unsuitable for packaging and dependence-capture. This may change in the future as I use it more and more for teaching, and ideally get feedback from external users. For the time being, the generous licensing allows anyone to wrap the code into a package themselves, with the caveat that they have to assume the maintenance burden.