Adaptive proposals
Hi, I implemented some very basic adaptive Metropolis-Hastings proposals that may be of interest for the AdvancedMH.jl package. These are basically simple symmetric random-walk proposals where the scale is subjected to diminishing adaptation (see e.g. section 3 here). These are mainly of interest in Metropolis-within-Gibbs-like algorithms, and they could be of interest in Turing. For instance, for a simple test case:
data = rand(Normal(2., 1.), 500)
m1 = DensityModel(x -> loglikelihood(Normal(x,1), data))
p1 = RandomWalkProposal(Normal())
p2 = AdaptiveProposal(Normal())
c1 = sample(m1, MetropolisHastings(p1), 10000; chain_type=Chains)
c2 = sample(m1, MetropolisHastings(p2), 10000; chain_type=Chains)
julia> ess(c1)
ESS
parameters ess rhat
Symbol Float64 Float64
param_1 605.4018 1.0013
julia> ess(c2)
ESS
parameters ess rhat
Symbol Float64 Float64
param_1 2222.6857 1.0006
Also, this is my first PR to a collaborative project, and am taking this as an opportunity to learn how to contribute to projects using git (hopefully with something useful). In other words, I'm not sure if I'm doing this right :)
If I find some time, and this is considered useful, I'd like to port some other related stuff I've used in other projects to AdvancedMH as well.
The test failures seem to be real, the sampled values do not match the expected ones. As a side remark, probably it would be helpful for debugging to use sample(...; progress = false) in the tests.
The test failures seem to be real, the sampled values do not match the expected ones. As a side remark, probably it would be helpful for debugging to use
sample(...; progress = false)in the tests.
Where do you see this? As far as I can tell, the three failed runs show the following results:
Test Summary: | Pass Fail Error Total
AdvancedMH | 20 2 1 23
StaticMH | 4 4
RandomWalk | 4 4
Adaptive random walk | 4 4
Compare adaptive to simple random walk | 1 1
parallel sampling | 2 2 4
Proposal styles | 4 4
Initial parameters | 1 1
EMCEE | 4 1
(which is the same as what I have locally). Maybe I'm looking in the wrong place.
Where do you see this? As far as I can tell, the three failed runs show the following results:
I looked at the raw logs, e.g., https://pipelines.actions.githubusercontent.com/1iyxZmDwGspc6UbQKDLTZ8y3DHazaAGCEXSw02G4avGy4Eell5/_apis/pipelines/1/runs/7302/signedlogcontent/15?urlExpires=2020-11-23T10%3A22%3A22.7581222Z&urlSigningMethod=HMACV1&urlSignature=%2FUDiYzMH2mBCIhIroN0uueS%2FjYsHAv7zGwfFj%2FCie90%3D. The nicely formatted output of the Github Actions is truncated since progress bars are printed (which is why I suggested to use sample(...; progress = false)).
The raw logs are difficult to read but according to the final statement there are two test failures somewhere. And actually, if you scroll up a bit you can find
2020-11-23T07:54:54.1228900Z Expression: ≈(mean(chain1["μ"]), 0.0, atol = 0.1)
2020-11-23T07:54:54.8851780Z Evaluated: 1.9981419463187078 ≈ 0.0 (atol=0.1)
2020-11-23T07:54:54.8852673Z Stacktrace:
2020-11-23T07:54:54.8853787Z [1] [0m[1mmacro expansion[22m
2020-11-23T07:54:54.9081715Z [90m @ [39m[90m~/work/AdvancedMH.jl/AdvancedMH.jl/test/[39m[90;4mruntests.jl:87[0m[90m [inlined][39m
2020-11-23T07:54:54.9083060Z [2] [0m[1mmacro expansion[22m
2020-11-23T07:54:54.9084504Z [90m @ [39m[90m/buildworker/worker/package_linux32/build/usr/share/julia/stdlib/v1.6/Test/src/[39m[90;4mTest.jl:1146[0m[90m [inlined][39m
2020-11-23T07:54:54.9086203Z [3] [0m[1mmacro expansion[22m
2020-11-23T07:54:54.9087509Z [90m @ [39m[90m~/work/AdvancedMH.jl/AdvancedMH.jl/test/[39m[90;4mruntests.jl:83[0m[90m [inlined][39m
2020-11-23T07:54:54.9088694Z [4] [0m[1mmacro expansion[22m
2020-11-23T07:54:54.9090192Z [90m @ [39m[90m/buildworker/worker/package_linux32/build/usr/share/julia/stdlib/v1.6/Test/src/[39m[90;4mTest.jl:1146[0m[90m [inlined][39m
2020-11-23T07:54:54.9091420Z [5] top-level scope
2020-11-23T07:54:54.9092587Z [90m @ [39m[90m~/work/AdvancedMH.jl/AdvancedMH.jl/test/[39m[90;4mruntests.jl:11[0m
2020-11-23T07:54:55.3782818Z [33m[1m┌ [22m[39m[33m[1mWarning: [22m[39mOnly a single thread available: MCMC chains are not sampled in parallel
2020-11-23T07:54:55.3785177Z [33m[1m└ [22m[39m[90m@ AbstractMCMC ~/.julia/packages/AbstractMCMC/lGdBK/src/sample.jl:204[39m
2020-11-23T07:54:55.5398596Z [32mSampling (1 threads) 0%| | ETA: N/A[39m
2020-11-23T07:54:56.5060198Z [32mSampling (1 threads) 25%|███████▌ | ETA: 0:00:03[39m
2020-11-23T07:54:56.5061792Z [32mSampling (1 threads) 50%|███████████████ | ETA: 0:00:01[39m
2020-11-23T07:54:56.5063683Z [32mSampling (1 threads) 75%|██████████████████████▌ | ETA: 0:00:00[39m
2020-11-23T07:54:56.5065281Z [32mSampling (1 threads) 100%|██████████████████████████████| Time: 0:00:00[39m
2020-11-23T07:54:56.5066874Z [90mSampling (1 threads) 100%|██████████████████████████████| Time: 0:00:00[39m
2020-11-23T07:54:56.5657764Z [37mparallel sampling: [39m[91m[1mTest Failed[22m[39m at [39m[1m/home/runner/work/AdvancedMH.jl/AdvancedMH.jl/test/runtests.jl:93[22m
2020-11-23T07:54:56.5659855Z Expression: ≈(mean(chain2["μ"]), 0.0, atol = 0.1)
2020-11-23T07:54:56.5865659Z Evaluated: 1.9910977829961962 ≈ 0.0 (atol=0.1)
which I was referring to. Apparently the sample average of the samples of μ do not match the expected value 0 (and actually, differ quite significantly).
I just checked locally, and all tests pass with the latest release and with master. So it seems this PR breaks something.
Where do you see this? As far as I can tell, the three failed runs show the following results:
I looked at the raw logs, e.g., https://pipelines.actions.githubusercontent.com/1iyxZmDwGspc6UbQKDLTZ8y3DHazaAGCEXSw02G4avGy4Eell5/_apis/pipelines/1/runs/7302/signedlogcontent/15?urlExpires=2020-11-23T10%3A22%3A22.7581222Z&urlSigningMethod=HMACV1&urlSignature=%2FUDiYzMH2mBCIhIroN0uueS%2FjYsHAv7zGwfFj%2FCie90%3D. The nicely formatted output of the Github Actions is truncated since progress bars are printed (which is why I suggested to use
sample(...; progress = false)).The raw logs are difficult to read but according to the final statement there are two test failures somewhere. And actually, if you scroll up a bit you can find
2020-11-23T07:54:54.1228900Z Expression: ≈(mean(chain1["μ"]), 0.0, atol = 0.1) 2020-11-23T07:54:54.8851780Z Evaluated: 1.9981419463187078 ≈ 0.0 (atol=0.1) 2020-11-23T07:54:54.8852673Z Stacktrace: 2020-11-23T07:54:54.8853787Z [1] �[0m�[1mmacro expansion�[22m 2020-11-23T07:54:54.9081715Z �[90m @ �[39m�[90m~/work/AdvancedMH.jl/AdvancedMH.jl/test/�[39m�[90;4mruntests.jl:87�[0m�[90m [inlined]�[39m 2020-11-23T07:54:54.9083060Z [2] �[0m�[1mmacro expansion�[22m 2020-11-23T07:54:54.9084504Z �[90m @ �[39m�[90m/buildworker/worker/package_linux32/build/usr/share/julia/stdlib/v1.6/Test/src/�[39m�[90;4mTest.jl:1146�[0m�[90m [inlined]�[39m 2020-11-23T07:54:54.9086203Z [3] �[0m�[1mmacro expansion�[22m 2020-11-23T07:54:54.9087509Z �[90m @ �[39m�[90m~/work/AdvancedMH.jl/AdvancedMH.jl/test/�[39m�[90;4mruntests.jl:83�[0m�[90m [inlined]�[39m 2020-11-23T07:54:54.9088694Z [4] �[0m�[1mmacro expansion�[22m 2020-11-23T07:54:54.9090192Z �[90m @ �[39m�[90m/buildworker/worker/package_linux32/build/usr/share/julia/stdlib/v1.6/Test/src/�[39m�[90;4mTest.jl:1146�[0m�[90m [inlined]�[39m 2020-11-23T07:54:54.9091420Z [5] top-level scope 2020-11-23T07:54:54.9092587Z �[90m @ �[39m�[90m~/work/AdvancedMH.jl/AdvancedMH.jl/test/�[39m�[90;4mruntests.jl:11�[0m 2020-11-23T07:54:55.3782818Z �[33m�[1m┌ �[22m�[39m�[33m�[1mWarning: �[22m�[39mOnly a single thread available: MCMC chains are not sampled in parallel 2020-11-23T07:54:55.3785177Z �[33m�[1m└ �[22m�[39m�[90m@ AbstractMCMC ~/.julia/packages/AbstractMCMC/lGdBK/src/sample.jl:204�[39m 2020-11-23T07:54:55.5398596Z �[32mSampling (1 threads) 0%| | ETA: N/A�[39m 2020-11-23T07:54:56.5060198Z �[32mSampling (1 threads) 25%|███████▌ | ETA: 0:00:03�[39m 2020-11-23T07:54:56.5061792Z �[32mSampling (1 threads) 50%|███████████████ | ETA: 0:00:01�[39m 2020-11-23T07:54:56.5063683Z �[32mSampling (1 threads) 75%|██████████████████████▌ | ETA: 0:00:00�[39m 2020-11-23T07:54:56.5065281Z �[32mSampling (1 threads) 100%|██████████████████████████████| Time: 0:00:00�[39m 2020-11-23T07:54:56.5066874Z �[90mSampling (1 threads) 100%|██████████████████████████████| Time: 0:00:00�[39m 2020-11-23T07:54:56.5657764Z �[37mparallel sampling: �[39m�[91m�[1mTest Failed�[22m�[39m at �[39m�[1m/home/runner/work/AdvancedMH.jl/AdvancedMH.jl/test/runtests.jl:93�[22m 2020-11-23T07:54:56.5659855Z Expression: ≈(mean(chain2["μ"]), 0.0, atol = 0.1) 2020-11-23T07:54:56.5865659Z Evaluated: 1.9910977829961962 ≈ 0.0 (atol=0.1)which I was referring to. Apparently the sample average of the samples of
μdo not match the expected value 0 (and actually, differ quite significantly).I just checked locally, and all tests pass with the latest release and with master. So it seems this PR breaks something.
OK, thanks, I saw that too but was confused since I don't think I changed anything related to parallel sampling, and I thought you were referring to the AdaptiveProposal tests. I'll have a look.
Hi there, a question related to this PR (feel free to move it to the issue section, I'm not sure what the best place to drop this is).
I'm currently facing an application where I would really like to use adaptive proposals like those defined in this PR in a Metropolis-within-Gibbs setting (i.e. we have a parameter vector x, for each parameter have an adaptive univariate proposal, and in each iteration of the MCMC sampler we update each component of the parameter vector conditional on the others using a Metropolis-Hastings step). The Turing-way to go would seem to use the stuff implemented in AdvancedMH in a Turing composite Gibbs sampler (something roughly like Gibbs(AdaptiveMH(:p1), AdaptiveMH(:p2), ...) where the p1, p2, ... are the parameter vector components)? I think in general this is worthwhile for low-dimensional applications where the gradient of the loglikelihood is really costly or unavailable. I wonder what would be the best way to proceed to allow this? Thanks for any hints!
Yeah, this should probably be a separate issue -- can you copy it over and I'll drop my comments there?
It looks like there's only one thing to change in this PR -- if that is updated I'd be happy to merge this one in.
OK, I had implemented multivariate normal adaptive proposals and had it working so thought to add it to this PR. Not sure how the test pipeline got messed up?
Anything missing from this PR? If there is nothing major standing in the way, maybe we should revisit and get it merged asap.
Hi, I should have a look at it again, @devmotion had some good suggestions on the multivariate adaptive normal kernel, which I did not yet look into further at as I got busy with other stuff. The univariate adaptive proposals should be fine I think.
Is this still of interest/relevant? I'd definitely be interested in using this and may be able to help finish it up.
Is this still of interest/relevant? I'd definitely be interested in using this and may be able to help finish it up.
It's still relevant; please feel free to take this over.