Skip metric regularization if adapt window=0
Submission Checklist
- [X] Run unit tests: ./runTests.py src/test/unit
- [X] Run cpplint: make cpplint
- [X] Declare copyright holder and open-source license: see below
Summary
This came up in https://github.com/stan-dev/stan/pull/3027#issuecomment-814223714
CmdStan accepts window=0 as a valid argument, even though it means the adaptation cannot possibly estimate the inverse metric. The intended effect appears to be to adapt only the step size and leave the metric at its initial value, either the default identity or a user-specified value. That works, but only if you also set init_buffer=0. If init_buffer>0, the sampler tries to update the metric anyway and overwrites it.
The metric update code is
https://github.com/stan-dev/stan/blob/35a15c8c206a09415318fbdfc0b324993929fd08/src/stan/mcmc/var_adaptation.hpp#L24-L28
The Welford estimator's sample_variance() is a no-op if there are not enough samples, so that part is fine.
The problem is the subsequent regularization, which with zero samples simply discards the (unchanged) metric.
The fix is to skip the regularization whenever sample_variance() did not do anything.
Intended Effect
Setting window=0 always keeps the initial metric.
How to Verify
Run the examples/bernoulli model with adapt init_buffer=10 window=0 and look for the mass matrix in output.csv.
On develop it says
# Diagonal elements of inverse mass matrix:
# 0.001
On this branch
# Diagonal elements of inverse mass matrix:
# 1
Side Effects
Documentation
Copyright and Licensing
Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Niko Huurre
By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
| Name | Old Result | New Result | Ratio | Performance change (1 - new / old) |
|---|---|---|---|---|
| gp_pois_regr/gp_pois_regr.stan | 3.36 | 3.43 | 0.98 | -2.09% slower |
| low_dim_corr_gauss/low_dim_corr_gauss.stan | 0.02 | 0.02 | 0.96 | -4.56% slower |
| eight_schools/eight_schools.stan | 0.12 | 0.12 | 0.97 | -2.87% slower |
| gp_regr/gp_regr.stan | 0.16 | 0.16 | 0.99 | -0.69% slower |
| irt_2pl/irt_2pl.stan | 6.09 | 5.99 | 1.02 | 1.52% faster |
| performance.compilation | 92.0 | 88.87 | 1.04 | 3.4% faster |
| low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan | 8.66 | 8.64 | 1.0 | 0.23% faster |
| pkpd/one_comp_mm_elim_abs.stan | 29.3 | 29.42 | 1.0 | -0.42% slower |
| sir/sir.stan | 130.24 | 119.52 | 1.09 | 8.23% faster |
| gp_regr/gen_gp_data.stan | 0.03 | 0.03 | 0.99 | -0.81% slower |
| low_dim_gauss_mix/low_dim_gauss_mix.stan | 3.01 | 3.02 | 1.0 | -0.35% slower |
| pkpd/sim_one_comp_mm_elim_abs.stan | 0.41 | 0.39 | 1.05 | 4.51% faster |
| arK/arK.stan | 1.88 | 1.84 | 1.02 | 2.01% faster |
| arma/arma.stan | 0.76 | 0.86 | 0.88 | -13.04% slower |
| garch/garch.stan | 0.56 | 0.57 | 0.99 | -1.38% slower |
| Mean result: 0.997801547886 |
Commit hash: 3830f6f1a9c7fd29dae971b263b04740f8914754
Machine information
ProductName: Mac OS X
ProductVersion: 10.11.6
BuildVersion: 15G22010
CPU: Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz
G++: Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1 Apple LLVM version 7.0.2 (clang-700.1.81) Target: x86_64-apple-darwin15.6.0 Thread model: posix
Clang: Apple LLVM version 7.0.2 (clang-700.1.81) Target: x86_64-apple-darwin15.6.0 Thread model: posix
I largely agree with the update, although technically inverse metric updating is skipped for both adapt_window = 0 and adapt_window = 1. I'll change the pull request title before merging.
If changes are made according to https://github.com/stan-dev/stan/pull/3027#issuecomment-820638658, then no warning message would be needed for the base_window = 0 case, and messages would not have to be passed. However, if inverse metric updating is also disabled for base_window = 1, I think we still need the message.