Add a function for generating initial values from a CmdStanMCMC object, csv files or a draws object
Submission Checklist
- [X ] Run unit tests
- [ X] Declare copyright holder and agree to license (see below)
Summary
I initially built these functions for another package I'm working on last week. But yesterday I saw #876, so I decided to share them. I did see @SteveBronder is already working on it, but since I had them ready, here's a PR for them - if you decide to go with the other approach, no worries!
This PR adds a generic method generate_inits which creates a lists of lists that can be used to pass to the init argument for sampling.
There are three specific methods for generating inits from CmdStanMCMC object, a vector of csv files with draws output, or a draws object.
The methods each takes as arguments:
- variables: specify which variables to generate inits for
- FUN: a function that takes a vector of draws for a parameter and outputs a single value. The default is to sample 1 random value from the draws, but it can be any function such as mean, median, etc
- The method for CmdStanMCMC also takes an argument
drawswith options "last", "sampling", "all". If the argument is "last", it uses a new function I wroteread_last_drawswhich retrieves the last draw recorded from the csv files. This does not use the builting read_cmdstan_csv function, but a similar method. This can be useful for very large fits - it takes 0-2 seconds regardless of the size of the files
The general logic is that for each chain, it:
- retrieves the draws
- applies the function specified by FUN to each parameter
- reformats the result into a list of lists by infering the dimensions for each parameter
All the methods are documented, and include unit tests.
Examples
inits from a CmdStanMCMC object
stanfit <- cmdstanr::cmdstanr_example("logistic")
inits1 <- generate_inits(stanfit)
inits2 <- generate_inits(stanfit, FUN = mean)
inits3 <- generate_inits(stanfit, FUN = quantile, probs = 0.5)
inits4 <- generate_inits(stanfit, draws = "last")
str(inits1)
#> List of 4
#> $ :List of 2
#> ..$ alpha: num 0.469
#> ..$ beta : num [1:3(1d)] -0.776 -0.423 0.393
#> $ :List of 2
#> ..$ alpha: num 0.22
#> ..$ beta : num [1:3(1d)] -0.334 -0.162 0.475
#> $ :List of 2
#> ..$ alpha: num 0.153
#> ..$ beta : num [1:3(1d)] -0.692 -0.331 0.463
#> $ :List of 2
#> ..$ alpha: num 0.396
#> ..$ beta : num [1:3(1d)] -0.7717 -0.0398 0.3815
inits from a vector of file paths
files <- stanfit$output_files()
generate_inits(files)
inits from a draws object
draws <- stanfit$draws()
generate_inits(draws)
warmup and then use the final draws for the inits of a separate sampling stage
warmup <- cmdstanr_example("logistic", parallel_chains = 4, iter_sampling = 0, save_warmup = T)
inits <- generate_inits(warmup, draws = "last")
mod <- cmdstan_model(exe_file = warmup$runset$exe_file())
fit <- mod$sample(warmup$data_file(),
parallel_chains = 4,
init = inits,
iter_warmup = 0,
inv_metric = warmup$inv_metric(matrix = FALSE),
step_size = warmup$metadata()$step_size_adaptation,
adapt_engaged = FALSE)
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
lp__ -66.06 -65.69 1.56 1.33 -69.06 -64.27 1.00 2020 2383
alpha 0.38 0.38 0.22 0.22 0.02 0.76 1.00 4141 2964
beta[1] -0.67 -0.66 0.25 0.25 -1.10 -0.26 1.00 4307 3175
beta[2] -0.27 -0.27 0.23 0.23 -0.66 0.10 1.00 3956 3006
beta[3] 0.68 0.67 0.28 0.27 0.23 1.14 1.00 4183 2273
log_lik[1] -0.51 -0.51 0.10 0.10 -0.69 -0.36 1.00 4143 2797
log_lik[2] -0.40 -0.38 0.15 0.14 -0.68 -0.19 1.00 4318 2762
log_lik[3] -0.50 -0.46 0.22 0.21 -0.92 -0.21 1.00 4368 3021
log_lik[4] -0.45 -0.43 0.16 0.15 -0.74 -0.23 1.00 3889 3043
log_lik[5] -1.19 -1.17 0.29 0.28 -1.68 -0.76 1.00 4594 2657
# compare with standard fitting with combined warmup and sampling
fit_standard <- mod$sample(warmup$data_file(),
parallel_chains = 4)
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
lp__ -66.01 -65.69 1.48 1.29 -68.81 -64.28 1.00 2186 3008
alpha 0.38 0.38 0.22 0.22 0.02 0.75 1.00 4170 2867
beta[1] -0.67 -0.67 0.25 0.24 -1.09 -0.27 1.00 3536 3062
beta[2] -0.27 -0.27 0.23 0.22 -0.65 0.11 1.00 4264 3301
beta[3] 0.69 0.69 0.27 0.28 0.25 1.15 1.00 4175 3179
log_lik[1] -0.51 -0.51 0.10 0.10 -0.69 -0.36 1.00 4055 3100
log_lik[2] -0.40 -0.38 0.15 0.14 -0.67 -0.20 1.00 4626 3386
log_lik[3] -0.49 -0.46 0.22 0.20 -0.90 -0.20 1.00 4306 3164
log_lik[4] -0.45 -0.43 0.16 0.15 -0.73 -0.23 1.00 3885 2900
log_lik[5] -1.20 -1.18 0.29 0.28 -1.71 -0.76 1.00 4678 3246
Copyright and Licensing
Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Vencislav Popov
By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
This doesn't have a method for pathfinder, but since generate_inits is a generic, a method could be added if you like this approach for the general framework
Thank you very much @venpopov! @SteveBronder min reviewing this, given that you are already working on something similar?
Codecov Report
Attention: Patch coverage is 78.75000% with 17 lines in your changes are missing coverage. Please review.
Project coverage is 86.56%. Comparing base (
28329d7) to head (14ec040).
:exclamation: Current head 14ec040 differs from pull request most recent head d729b32. Consider uploading reports for the commit d729b32 to get more accurate results
| Files | Patch % | Lines |
|---|---|---|
| R/inits.R | 78.75% | 17 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## master #930 +/- ##
==========================================
- Coverage 88.32% 86.56% -1.77%
==========================================
Files 12 13 +1
Lines 4549 4629 +80
==========================================
- Hits 4018 4007 -11
- Misses 531 622 +91
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Very cool thank you for the contribution!
The issue with pathfinder is that pathfinder is the only Stan algorithm exposed in cmdstanr that uses cmdstan's internal threading vs. dispatching multiple cmdstan processes. So you need to generate files ending in _{1...N}.json for each pathfinder respectively. It gets weird because every other algorithm uses num_procs (the number of cmdstan processes to create) while pathfinder only needs 1 proc but with multiple init files. A lot of if statements can fix this but it starts looking really patchy.
The PR I wrote is a little more heavy handed. It rips out all of the dispatching and uses cmdstan's internal threading for all the algorithms.
But I also don't have code to work over raw draws and I like your init functions much more than mine. Would you mind if I cherry picked some of this over to the branch I'm writing and added you there as a contributor?
No, I don't mind - go ahead!
Closing this as implemented in #937