cmdstanpy

[Stan 2.34] expose sampler argument "save_metric"

Open mitzimorris opened this issue 2 years ago • 9 comments

Summary:

Add functionality to CmdStanPy to save the metric as JSON.

Description:

CmdStan PR https://github.com/stan-dev/cmdstan/pull/1203 added a new adaptive sampler argument, save_metric. Make this available from CmdStanPy:

  • Expose adaptive sampler argument save_metric
  • Add methods to CmdStanMCMC to access the saved metric and stepsize.
  • Document use and use cases.
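For illustration only, here is a minimal sketch of what reading one saved metric file could look like. It assumes the per-chain JSON file holds the keys "stepsize", "metric_type", and "inv_metric"; that layout is an assumption here and should be checked against the CmdStan documentation. The names SavedMetric and load_metric_file are hypothetical and are not part of CmdStanPy.

```python
import json
from dataclasses import dataclass
from typing import List, Union


@dataclass
class SavedMetric:
    """Adaptation results for one chain (hypothetical container, not a CmdStanPy class)."""

    stepsize: float
    metric_type: str
    # A flat list for a diagonal metric, a list of rows for a dense one.
    inv_metric: Union[List[float], List[List[float]]]


def load_metric_file(path):
    """Read one per-chain metric JSON file.

    Assumes the file contains the keys "stepsize", "metric_type"
    and "inv_metric"; a missing key raises KeyError.
    """
    with open(path) as fd:
        raw = json.load(fd)
    return SavedMetric(
        stepsize=float(raw["stepsize"]),
        metric_type=str(raw["metric_type"]),
        inv_metric=raw["inv_metric"],
    )
```

Accessor methods on CmdStanMCMC could then collect one such record per chain instead of parsing the adaptation comments in the CSV output.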

Current Version:

This will be available in Stan 2.34.

mitzimorris avatar Oct 28 '23 20:10 mitzimorris

Should we expose it, or just always enable it by default?

The files are never that large relative to the output CSVs.

WardBrian avatar Oct 30 '23 14:10 WardBrian

Always enable by default. What are we doing with save_config? We should be consistent.

mitzimorris avatar Oct 30 '23 17:10 mitzimorris

I think save_cmdstan_config will always be enabled if it is supported. Eventually I think it makes sense to move essentially all the config logic to reading that file and then ignore the comments in the output csvs entirely.

WardBrian avatar Oct 30 '23 18:10 WardBrian

We still want to add this, correct? It would be fairly straightforward to do so in line with the other output files we're tracking.

With this enabled, we would then replace the current CSV parsing for stepsize and metric with the content of these files?

amas0 avatar Nov 17 '25 20:11 amas0

> With this enabled, we would then replace the current CSV parsing for stepsize and metric with the content of these files?

Yes, exactly. We’d ideally like to get to the point where the only thing we read from the CSV file is the draws, ignoring all the other comments.

WardBrian avatar Nov 17 '25 21:11 WardBrian

I have been taking a look at this today and have a question.

When we call something like CmdStanModel.generate_quantities(..., previous_fit=[files]), where files is a list of Stan CSV output files from a previous MCMC run, we create a new CmdStanMCMC object from these files. As we move away from parsing all of this information from the Stan CSV files, we can't guarantee that the corresponding metric.json files are also available when making the underlying from_csv call.

How should we handle the metric information in that case? Should we try to lazily load it so you can create the CmdStanMCMC object from the csv, but if you try to access the metric info it will throw an exception? Should we just make those properties fault tolerant and perhaps just throw a warning if they can't be found? Or just have them be None when we don't have access to the JSON files?

amas0 avatar Nov 18 '25 21:11 amas0
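
To make the three options in the last comment concrete, here is a minimal sketch rather than a proposed implementation. LazyMetricInfo and its on_missing parameter are hypothetical names, not CmdStanPy API, and the JSON keys "stepsize" and "inv_metric" are assumed. Nothing is read when the object is created; the file is only looked up when a property is first accessed, and a missing file then raises, warns, or silently yields None depending on the chosen policy.

```python
import json
import warnings
from pathlib import Path


class LazyMetricInfo:
    """Hypothetical helper that defers reading a metric JSON file.

    on_missing selects what happens when the file is absent:
      "raise" -> FileNotFoundError on first access
      "warn"  -> issue a warning and return None
      "none"  -> return None silently
    """

    def __init__(self, metric_file, on_missing="none"):
        if on_missing not in ("raise", "warn", "none"):
            raise ValueError(f"unknown on_missing policy: {on_missing!r}")
        self._path = Path(metric_file)
        self._on_missing = on_missing
        self._cache = None

    def _load(self):
        # Return the cached contents if the file was already read.
        if self._cache is not None:
            return self._cache
        if not self._path.is_file():
            msg = f"Metric file not found: {self._path}"
            if self._on_missing == "raise":
                raise FileNotFoundError(msg)
            if self._on_missing == "warn":
                warnings.warn(msg)
            return None
        with open(self._path) as fd:
            self._cache = json.load(fd)
        return self._cache

    @property
    def stepsize(self):
        data = self._load()
        return None if data is None else data["stepsize"]

    @property
    def inv_metric(self):
        data = self._load()
        return None if data is None else data["inv_metric"]


# Construction succeeds even without the JSON file, as from_csv would need.
info = LazyMetricInfo("does_not_exist_metric.json", on_missing="none")
print(info.stepsize)
```

With the "none" policy, objects built from CSV files alone keep working and callers have to check for None. The "raise" policy surfaces the missing file at the point of use instead.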