cmdstan

id for each chain should be unique in multi-chain output CSVs

Open SteveBronder opened this issue 1 year ago • 2 comments

Summary:

Right now, when several chains are run with num_chains, every output CSV reports the same id in its header. Each output file should instead report the id of the chain it belongs to.

Reproducible Steps:

make examples/bernoulli/bernoulli
./examples/bernoulli/bernoulli sample num_chains=2 data file="./examples/bernoulli/bernoulli.data.R"

Current Output:

For model_2.csv as an example

# stan_version_major = 2
# stan_version_minor = 34
# stan_version_patch = 1
# model = bernoulli_model
# start_datetime = 2024-03-14 19:00:54 UTC
# method = sample (Default)
#   sample
#     num_samples = 1000 (Default)
#     num_warmup = 1000 (Default)
#     save_warmup = 0 (Default)
#     thin = 1 (Default)
#     adapt
#       engaged = 1 (Default)
#       gamma = 0.05 (Default)
#       delta = 0.8 (Default)
#       kappa = 0.75 (Default)
#       t0 = 10 (Default)
#       init_buffer = 75 (Default)
#       term_buffer = 50 (Default)
#       window = 25 (Default)
#       save_metric = 0 (Default)
#     algorithm = hmc (Default)
#       hmc
#         engine = nuts (Default)
#           nuts
#             max_depth = 10 (Default)
#         metric = diag_e (Default)
#         metric_file =  (Default)
#         stepsize = 1 (Default)
#         stepsize_jitter = 0 (Default)
#     num_chains = 2
# id = 1 (Default)
# data

Expected Output:

For model_2.csv

# stan_version_major = 2
# stan_version_minor = 34
# stan_version_patch = 1
# model = bernoulli_model
# start_datetime = 2024-03-14 19:00:54 UTC
# method = sample (Default)
#   sample
#     num_samples = 1000 (Default)
#     num_warmup = 1000 (Default)
#     save_warmup = 0 (Default)
#     thin = 1 (Default)
#     adapt
#       engaged = 1 (Default)
#       gamma = 0.05 (Default)
#       delta = 0.8 (Default)
#       kappa = 0.75 (Default)
#       t0 = 10 (Default)
#       init_buffer = 75 (Default)
#       term_buffer = 50 (Default)
#       window = 25 (Default)
#       save_metric = 0 (Default)
#     algorithm = hmc (Default)
#       hmc
#         engine = nuts (Default)
#           nuts
#             max_depth = 10 (Default)
#         metric = diag_e (Default)
#         metric_file =  (Default)
#         stepsize = 1 (Default)
#         stepsize_jitter = 0 (Default)
#     num_chains = 2
# id = 2
# data
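
One way to confirm the difference between the current and expected output is to read the id line back out of each file's header. The following is a minimal sketch, not part of CmdStan: `read_chain_id` is a hypothetical helper, and it assumes the header line has the exact form `# id = <integer>`, optionally followed by ` (Default)`, as in the listings above.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not a CmdStan function): return the integer on the
// first "# id = ..." line of a Stan CSV stream, or -1 if no such line exists.
int read_chain_id(std::istream& csv) {
  const std::string prefix = "# id = ";
  std::string line;
  while (std::getline(csv, line)) {
    if (line.compare(0, prefix.size(), prefix) == 0) {
      // std::stoi stops at the first non-digit, so a trailing " (Default)"
      // suffix is ignored.
      return std::stoi(line.substr(prefix.size()));
    }
  }
  return -1;
}
```

Opening each output file with `std::ifstream` and comparing the returned values would show the same id in every file today, and distinct ids once the issue is fixed.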

Additional Information:

I swear I did this in the past, but I think what we need to do is pass the chain number (as an iterator or index) to parser.print. Then, when that function prints the id, it can override the value with the chain id used within the Stan program.
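
The idea can be sketched roughly as follows. This is an illustration only, not CmdStan's actual code: `HeaderArg`, `print_header`, and `header_for_chain` are hypothetical names, the real argument classes and the signature of parser.print differ, and the sketch assumes the id of chain i (0-based) is the requested id plus i.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for one parsed command-line argument.
struct HeaderArg {
  std::string name;
  std::string value;
  bool is_default;
};

// Print the header for one chain. chain_index is 0-based.
// Assumption: the "id" argument holds the id of the first chain, and later
// chains use consecutive ids.
void print_header(std::ostream& out, const std::vector<HeaderArg>& args,
                  int chain_index) {
  assert(chain_index >= 0);
  for (const HeaderArg& arg : args) {
    if (arg.name == "id" && chain_index > 0) {
      // Override the requested value with this chain's own id. The value no
      // longer matches the default, so the "(Default)" suffix is dropped.
      const int chain_id = std::stoi(arg.value) + chain_index;
      out << "# id = " << chain_id << '\n';
    } else {
      out << "# " << arg.name << " = " << arg.value
          << (arg.is_default ? " (Default)" : "") << '\n';
    }
  }
}

// Convenience wrapper that returns the header as a string.
std::string header_for_chain(const std::vector<HeaderArg>& args,
                             int chain_index) {
  std::ostringstream out;
  print_header(out, args, chain_index);
  return out.str();
}
```

With `id = 1 (Default)` as input, `header_for_chain(args, 1)` would emit `# id = 2` for the second file, matching the expected output above, while the first file's header is unchanged.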

Current Version:

v2.34.1

SteveBronder, Mar 14 '24 19:03

This is really another variant of the bug in https://github.com/stan-dev/cmdstan/issues/1029, where the output still reports running HMC if the fixed_param sampler is run when the model lacks parameters. In both cases the header always echoes the command line as requested, not the actual run information.

WardBrian, Mar 14 '24 19:03

The same behavior also caused some confusion on the forums, because the inits argument is reported as the same value in every file: https://discourse.mc-stan.org/t/cmdstanpy-supplying-multiple-paths-as-inits-writes-to-tmp/34968

WardBrian, May 01 '24 15:05