Skip metric regularization if adapt window=0

Open nhuurre opened this issue 4 years ago • 3 comments

Submission Checklist

  • [X] Run unit tests: ./runTests.py src/test/unit
  • [X] Run cpplint: make cpplint
  • [X] Declare copyright holder and open-source license: see below

Summary

This came up in https://github.com/stan-dev/stan/pull/3027#issuecomment-814223714

CmdStan accepts window=0 as a valid argument even though it means the adaptation cannot possibly estimate the inverse metric. The intended effect appears to be to adapt only the step size and leave the metric at its initial value, either the default identity or a user-specified value. That works, but only if you also set init_buffer=0. If init_buffer>0, the sampler tries to update the metric anyway and overwrites it.

The metric update code is at
https://github.com/stan-dev/stan/blob/35a15c8c206a09415318fbdfc0b324993929fd08/src/stan/mcmc/var_adaptation.hpp#L24-L28

The Welford estimator's sample_variance() is a no-op if there are not enough samples, so that part is fine. The problem is the subsequent regularization, which, in the case of zero samples, just discards the (unchanged) metric. The fix is simply to skip the regularization when sample_variance() didn't do anything.

Intended Effect

Setting window=0 always keeps the initial metric.

How to Verify

Run the examples/bernoulli model with adapt init_buffer=10 window=0 and look for the mass matrix in output.csv. On develop it says

# Diagonal elements of inverse mass matrix:
# 0.001

On this branch

# Diagonal elements of inverse mass matrix:
# 1

Side Effects

Documentation

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Niko Huurre

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

  • Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
  • Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

nhuurre, Apr 12 '21 07:04


Name | Old Result | New Result | Ratio | Performance change (1 - new / old)
gp_pois_regr/gp_pois_regr.stan | 3.36 | 3.43 | 0.98 | -2.09% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan | 0.02 | 0.02 | 0.96 | -4.56% slower
eight_schools/eight_schools.stan | 0.12 | 0.12 | 0.97 | -2.87% slower
gp_regr/gp_regr.stan | 0.16 | 0.16 | 0.99 | -0.69% slower
irt_2pl/irt_2pl.stan | 6.09 | 5.99 | 1.02 | 1.52% faster
performance.compilation | 92.0 | 88.87 | 1.04 | 3.4% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan | 8.66 | 8.64 | 1.0 | 0.23% faster
pkpd/one_comp_mm_elim_abs.stan | 29.3 | 29.42 | 1.0 | -0.42% slower
sir/sir.stan | 130.24 | 119.52 | 1.09 | 8.23% faster
gp_regr/gen_gp_data.stan | 0.03 | 0.03 | 0.99 | -0.81% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan | 3.01 | 3.02 | 1.0 | -0.35% slower
pkpd/sim_one_comp_mm_elim_abs.stan | 0.41 | 0.39 | 1.05 | 4.51% faster
arK/arK.stan | 1.88 | 1.84 | 1.02 | 2.01% faster
arma/arma.stan | 0.76 | 0.86 | 0.88 | -13.04% slower
garch/garch.stan | 0.56 | 0.57 | 0.99 | -1.38% slower
Mean result: 0.997801547886

Jenkins Console Log Blue Ocean Commit hash: 3830f6f1a9c7fd29dae971b263b04740f8914754


Machine information
ProductName: Mac OS X
ProductVersion: 10.11.6
BuildVersion: 15G22010

CPU: Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++: Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1 Apple LLVM version 7.0.2 (clang-700.1.81) Target: x86_64-apple-darwin15.6.0 Thread model: posix

Clang: Apple LLVM version 7.0.2 (clang-700.1.81) Target: x86_64-apple-darwin15.6.0 Thread model: posix

stan-buildbot, Apr 12 '21 08:04

Although I largely agree with the update, technically inverse metric updating is skipped for both adapt_window = 0 and adapt_window = 1. I'll change the pull request name before merging.

betanalpha, Apr 15 '21 16:04

If changes are made according to https://github.com/stan-dev/stan/pull/3027#issuecomment-820638658 then no warning message would be needed for the base_window = 0 case, and messages wouldn't have to be passed. But if inverse metric updating is also disabled for base_window = 1, then I think we still need the message.

betanalpha, Apr 15 '21 18:04