cmdstanpy Enable `save_metric=1` and sources MCMC metric info from new JSON file

Submission Checklist

[x] Run unit tests
[x] Declare copyright holder and open-source license: see below

Summary

This PR enables save_metric=1 for cmdstan sampling with adaptation. This outputs a new metric JSON file that contains the step size, inv_metric, and metric type for the sampling run. This PR also now lazily sources this info from these files when the corresponding properties on the CmdStanMCMC objects are accessed. By changing the source of this info, we can now also remove all the code where we were parsing adaptation info from the Stan CSV file, which this PR does.

Importantly, this PR proposes adding pydantic as a dependency to cmdstanpy. With sourcing more info from JSON files (in this metric change and soon from the config files), we want to be more careful about I/O validation. Pydantic is now a fairly standard way to parse and validate data in the Python ecosystem and I think is a good fit for our needs.

Closes #713.

@WardBrian In the issue comments, we discussed an update to the save_csvfiles methods that would include the other output files. This PR doesn't include that change. I could work to incorporate it here, but I think it makes sense to look at that as a standalone PR.

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): myself

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)

Dec 02 '25 21:12 amas0

I'll need some time to look over this but at first glance it looks pretty good. I don't think Pydantic would make a ton of sense for this alone, but as you note we will probably want something more powerful when loading the config files.

I do worry about being too strict about IO, since it could mean that cmdstanpy becomes unusable with development versions or right when a new version is released before cmdstanpy has itself updated. Right now this will most-likely 'just work' as long as you don't request information related to the part of the config that is new/different in that cmdstan version, but if we were doing full validation it would fail even if you only actually need some unrelated piece

Dec 03 '25 19:12 WardBrian

Yeah if it were just this, I wouldn't be interested in adding the dependency, but with potentially quite a few output files that need to be parsed, I think it would be useful. We could always implement the equivalent structures without pydantic, would just be a bit extra work.

Good point about strictness -- perhaps we could have validation issues throw warnings instead of errors? And only be strict where it would indicate something has gone wrong

Dec 03 '25 19:12 amas0

(assuming those test failures are just noise?)

Dec 17 '25 16:12 WardBrian

No worries!

There was one test failure that was numerical. That was just noise.

I'm scratching my head on these Windows failures, though. I tried to replicate in a VM locally, and they were all passing. Have you seen anything like this before?

Dec 17 '25 16:12 amas0

I wouldn't be shocked if there was some weird difference, but it's hard to tell with just the test output as it is. Maybe print the metric files, and put [] inside the all call so it's a list comprehension rather than a generator expression? That will make it more obvious which index is the problem

Dec 17 '25 16:12 WardBrian

:shrug:

I guess we're good

Dec 17 '25 17:12 amas0

huh, alright!

Dec 17 '25 17:12 WardBrian