anesthetic
Importance sampling unexpected behaviour [and plot gallery broken]
Describe the bug
NestedSamples.importance_sample changes the weights in an unexpected way. E.g. importance sampling with a constant offset of logL_new=0.1,
a = chain.importance_sample(0.1, action="add")
changes the weights, which (I think) it should not. This change of weights influences posteriors, means, standard deviations, etc. I think the easiest is to look at the weights directly.
Importance sampling with a negative offset also appears to shorten the chain, which I would not expect either.
I might just be misunderstanding the importance sampling procedure in nested sampling, but this is not at all what I expected from e.g. MCMC samples. Edit: My original goal was to use importance sampling to understand which of my likelihoods produces which constraints on my parameters; this simple "add 0.1" example was just the minimal thing behaving weirdly.
To Reproduce
a=chain.importance_sample(0.1, action="add")
b=chain.importance_sample(-0.1, action="add")
plt.plot(chain.weights, label='original weights')
plt.plot(a.weights, label='+0.1 weights')
plt.plot(b.weights, label='-0.1 weights')
plt.legend()
plt.show()
Expected behavior
I would expect the weights to stay the same, except for re-normalization. Instead, they change.
Screenshots
Here I simply plot the weights.
Additional context
My samples were generated with cobaya running pypolychord, then loading the "_polychord_raw" directory with anesthetic. I tried both the current git master branch and version python-anesthetic-git r319.2351380-2 from the AUR.
I tried to reproduce the issue with the sample file from the plot gallery, but the code there no longer works:
import requests
import tarfile

for filename in ["plikHM_TTTEEE_lowl_lowE_lensing.tar.gz",
                 "plikHM_TTTEEE_lowl_lowE_lensing_NS.tar.gz"]:
    github_url = "https://github.com/williamjameshandley/cosmo_example/raw/master/"
    url = github_url + filename
    open(filename, 'wb').write(requests.get(url).content)
    tarfile.open(filename).extractall()
from anesthetic import NestedSamples
nested_root = 'plikHM_TTTEEE_lowl_lowE_lensing_NS/NS_plikHM_TTTEEE_lowl_lowE_lensing'
nested = NestedSamples(root=nested_root)
returns RuntimeError: Not a valid nested sampling run. Require logL > logL_birth.
Hi Stefan,
the thing with nested sampling is that you need to watch out that the logL values stay greater than the logL_birth values. This corresponds to each nested sampling "step" needing to increase in likelihood (otherwise the step would be discarded). Thus, when you subtract from the logL values, quite a few might end up less than the corresponding logL_birth values and consequently need to be discarded. This effectively decreases the number of live points and thus the accuracy of your nested sampling run (reflected, for example, in larger errors on the evidence).
In any case, changing the logL will affect the weights, which are computed from logL and logL_birth, and which in the current version I believe are normalised such that the maximum weight is 1. Is that correct @williamjameshandley? So I am not surprised that the weights change. Keep in mind that the volume of the likelihood shell dlogX factors into the weights as well (which is quite different from MCMC runs).
Hi Stefan,
You should be able to get the behavior you're looking for by loading the samples as (or converting them into) MCMCSamples and then reweighting. The critical difference is that at the moment you are reweighting NestedSamples into another valid nested sampling run (with all the advantages, e.g. being able to compute evidences and KL divergences). Suffice it to say that this process is non-trivial, and Lukas and I plan to write it up in the near future so that the theory behind it is made much clearer (rather than being hidden across several pull requests).
Best, Will
Just to be sure we're on the same page: in general, we expect this function to alter the weights. But in the specific case of adding a constant to the log-likelihood, we expect no change (up to normalisation). Yet your plot shows something else going on.
Indeed it might be the logL_birth subtlety @lukashergt mentioned. I don't remember why logL_birth isn't updated simultaneously with logL. Was that a choice? Or is it a bug? Why shouldn't Stefan's code do logL += 0.1 and logL_birth += 0.1, leaving the weights unchanged (up to normalisation)?
@lukashergt Thanks, that explains the cutoff. I'd still expect the weights to approximately stay the same though, at least for the +0.1 case. (But I might well be wrong; I'm noticing I don't know all the nested sampling details.)
@williamjameshandley Thanks! I don't need the evidences for this part of my analysis so converting to MCMCSamples works and then importance sampling gives the expected results.
Actually, importance sampling the MCMC-converted chains correctly changes the logL but does not affect the weights at all, even when not reweighting by a trivial constant offset:
chain = MCMCSamples(orig_chain)
chain.tex = orig_chain.tex
fig, axes = chain.plot_1d("Omega_m", label='Original', color='blue')
logL_new = scipy.stats.norm.logpdf(chain.Omega_m, loc=0.1, scale=0.01)
reweighted = chain.importance_sample(logL_new, action='add')
reweighted.plot_1d(axes, label='Reweighted', color='red', linestyle='dashed')
if np.all(reweighted.weights == chain.weights):
    print("Weights equal")
plt.legend()
plt.show()
The weights are equal and the posteriors identical, even though logL_new is not constant.
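For reference, here is the behaviour one would expect from a correct MCMC-style importance reweighting (a minimal sketch; `importance_reweight` is a hypothetical helper, not anesthetic's implementation): with `action='add'`, each weight should be multiplied by exp(logL_new).

```python
import numpy as np

def importance_reweight(weights, logL_new):
    """Sketch of MCMC-style importance reweighting with action='add':
    multiply each sample weight by exp(logL_new), working in log space
    and normalising so the largest weight is 1."""
    logw = np.log(weights) + logL_new
    logw -= logw.max()          # normalise: maximum weight becomes 1
    return np.exp(logw)

# A constant logL_new leaves the weights unchanged (up to normalisation):
print(importance_reweight(np.ones(3), np.full(3, 0.1)))
# A non-constant logL_new must change them:
print(importance_reweight(np.ones(3), np.array([0.0, -1.0, -2.0])))
```

So unchanged weights under a non-constant logL_new, as observed above, signal a bug rather than expected behaviour.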
@Stefan-Heimersheim What exactly is the first plot showing? How did you generate it? Normally the samples should be ordered by weights if I am not mistaken...
I just took them in the order chain.weights returned them; they seem to go up and then down.
@Stefan-Heimersheim Importance sampling only works in the bulk posterior region of parameter space (that goes for both nested sampling and MCMC). With a Gaussian logL_new whose narrow peak is outside of the sampled region, I am not surprised that it doesn't pick up on the new logL... As an example, I'd try with a mean of Omega_m=0.305 to see if the posterior gets shifted as expected.
Can you post the line that generates the plot? I'm still not sure what's on the x-axis and what's on the y-axis...
The x-axis is the index in the weight list, the y-axis is the weight value. Sorry, I was lazy and omitted the x-axis argument in the original post; here is the equivalent line:
plt.plot(range(len(a.weights)), a.weights, label='+0.1 weights')
Edit: Original code
a=chain.importance_sample(0.1, action="add")
b=chain.importance_sample(-0.1, action="add")
plt.plot(chain.weights, label='original weights')
plt.plot(a.weights, label='+0.1 weights')
plt.plot(b.weights, label='-0.1 weights')
plt.legend()
plt.show()
Same result with a mean of Omega_m=0.305. (I would still expect a shift towards the left in the former case.)
chain = MCMCSamples(orig_chain)
chain.tex = orig_chain.tex
fig, axes = chain.plot_1d("Omega_m", label='Original', color='blue'); #hist, kde, scatter
logL_new = sst.norm.logpdf(chain.Omega_m, loc=0.305, scale=0.1)
reweighted = chain.importance_sample(logL_new, action='replace')
if np.all(reweighted.weights == chain.weights):
    print("Weights equal")
reweighted.plot_1d(axes, label='Reweighted', color='red', linestyle='dashed'); #hist, kde, scatter
plt.legend()
plt.show()
In the last example the scale is huge, which isn't ideal either. But since you used 'replace', that shouldn't be the issue, and there indeed seems to be nothing happening here. What do the posteriors look like with NestedSamples instead of MCMCSamples?
Without converting to MCMC samples, it works indeed:
Just talked to @lukashergt, it seems my chains might be a bit weird due to cobaya's treatment of non-uniform priors (see this pull request) and that might be causing some of the issues with NestedSamples.
The importance sampling function of MCMCSamples seems to be wrong, missing the parts where the weights are changed. I'll open a separate bug report for that.
Okay, so now I am at the point where I need evidences for my importance samples -- do I understand correctly that importance_sample for NestedSamples is not (correctly) implemented yet?
Can I just compute the evidence of the importance-sampled run as Z_new = Z_old * np.average(additional_likelihood, weights=samples.weights)? Here additional_likelihood is the value of the new likelihood (not log) that is to be added. Derived from:
Z_old = integral P(D|theta) P(theta) dtheta
Z_new = integral P_additional(D|theta) P(D|theta) P(theta) dtheta
      = Z_old * integral P_additional(D|theta) [P(D|theta) P(theta) / Z_old] dtheta
      = Z_old * np.average(additional_likelihood, weights=samples.weights)
The term in square brackets is the normalised posterior, so we can interpret the integral as the posterior expectation value of the added likelihood P_additional.
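The same posterior average can be evaluated in log space to avoid under- or overflow when the additional log-likelihoods are large in magnitude (a sketch; `updated_logZ` is a hypothetical helper, not an anesthetic function):

```python
import numpy as np
from scipy.special import logsumexp

def updated_logZ(logZ_old, additional_loglikes, weights):
    """Sketch: log Z_new = log Z_old + log <P_additional>_posterior,
    with the posterior average computed stably via logsumexp."""
    logw = np.log(weights) - np.log(np.sum(weights))   # normalised log-weights
    return logZ_old + logsumexp(additional_loglikes + logw)

# Sanity check: a constant additional log-likelihood c shifts log Z by exactly c.
w = np.array([0.2, 0.5, 0.3])
print(updated_logZ(-3.0, np.full(3, 0.1), w))  # -2.9
```

This is algebraically identical to multiplying Z_old by the weighted average of the (non-log) likelihood values, just safer numerically.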
I tried this out with the run_pypolychord example code. The normal run finds an evidence of around -3 +/- 0.2, and together with the additional likelihood the combined likelihood gives an evidence of around 0.1 +/- 0.2.
Using the expectation value formula gives roughly the right result of ~0, although it varies between runs.
importance_sample seems to give values around +2.8, which is closer to the difference between the log evidences, but as said before I don't think this is correctly implemented yet. I also attach the produced triangle plot (grey: old likelihood, blue: new likelihood, red: importance sample), which seems to be incorrect.
Full code to reproduce:
from numpy import pi, log, sqrt
import pypolychord
from pypolychord.settings import PolyChordSettings
from pypolychord.priors import UniformPrior
try:
    from mpi4py import MPI
except ImportError:
    pass
from copy import deepcopy
import scipy.stats as sst
import matplotlib.pyplot as plt
import numpy as np

#| Define a four-dimensional spherical Gaussian likelihood
#| with width sigma=0.1, centred on 0, with one derived parameter.
#| The derived parameter is the squared radius.
nDims = 4
nDerived = 1
sigma = 0.1

def likelihood(theta):
    """Simple Gaussian likelihood."""
    nDims = len(theta)
    r2 = sum(theta**2)
    logL = -log(2*pi*sigma*sigma)*nDims/2.0
    logL += -r2/2/sigma/sigma
    return logL, [r2]

def importance_likelihood(theta):
    return (sst.norm.logpdf(theta[0], scale=0.1, loc=0)
            + sst.norm.logpdf(theta[1], scale=0.1, loc=0)
            + sst.norm.logpdf(theta[2], scale=0.1, loc=0))

def likelihood2(theta):
    """Gaussian likelihood combined with the importance likelihood."""
    nDims = len(theta)
    r2 = sum(theta**2)
    logL = -log(2*pi*sigma*sigma)*nDims/2.0
    logL += -r2/2/sigma/sigma
    logL += importance_likelihood(theta)
    return logL, [r2]

#| Define a box uniform prior from -1 to 1
def prior(hypercube):
    """Uniform prior on [-1,1]^D."""
    return UniformPrior(-1, 1)(hypercube)

#| Optional dumper function giving run-time read access to
#| the live points, dead points, weights and evidences
def dumper(live, dead, logweights, logZ, logZerr):
    print("Last dead point:", dead[-1])

#| Initialise the settings
settings = PolyChordSettings(nDims, nDerived)
settings.file_root = 'gaussian'
settings.nlive = 200
settings.do_clustering = True
settings.read_resume = False

settings2 = PolyChordSettings(nDims, nDerived)
settings2.file_root = 'gaussian2'
settings2.nlive = 200
settings2.do_clustering = True
settings2.read_resume = False

#| Run PolyChord
output = pypolychord.run_polychord(likelihood, nDims, nDerived, settings, prior, dumper)
# | log(Z) = -3.05018 +/- 0.18038 |
output2 = pypolychord.run_polychord(likelihood2, nDims, nDerived, settings2, prior, dumper)
# | log(Z) = 0.05955 +/- 0.19555 |

#| Create paramnames files
paramnames = [('p%i' % i, r'\theta_%i' % i) for i in range(nDims)]
paramnames += [('r*', 'r')]
output.make_paramnames_files(paramnames)
output2.make_paramnames_files(paramnames)

#| Make anesthetic plots
from anesthetic import NestedSamples
samples = NestedSamples(root=settings.base_dir + '/' + settings.file_root)
samples2 = NestedSamples(root=settings.base_dir + '/' + settings2.file_root)
fig, axes = samples.plot_2d(['p0', 'p1', 'p2', 'p3', 'r'], alpha=1, color='grey')
samples2.plot_2d(axes, alpha=0.5, color='blue')

importance_loglikes = importance_likelihood([samples.p0, samples.p1, samples.p2])

# a) Resample using anesthetic
resamples = deepcopy(samples).importance_sample(importance_loglikes)
print("New evidence (a)", resamples.logZ())  # 2.76
resamples.plot_2d(axes, alpha=0.3, color='red')
plt.show()

# b) Manually estimate the new evidence
importance_likes = np.exp(importance_loglikes)
factor = np.average(importance_likes, weights=samples.weights)
print("Evidence changes by factor of", factor)
print("New evidence (b)", samples.logZ() + np.log(factor))  # -0.01
Hi @Stefan-Heimersheim. Many thanks for this extremely useful working example, which demonstrates the issue unambiguously. It shows that there is clearly a problem with the importance sampling method for nested samples. In theory the evidence should work, but the fact that contour-wise grey != red means that there is a deeper problem. This is something @lukashergt and I have discussed in passing (e.g. it's kind of clear that the current approach would fail if one added a constant log-likelihood). I will look into this in the next two weeks, as I have a good idea of how to fix this problem -- in analogy to how nested sampling can compute evidences and posteriors at different thermodynamic temperatures -- but working this into an importance sampling framework is a small project which I'll have time for in April.
In the meantime, your approach using the posterior average of the importance weights is both mathematically and functionally correct, and is in the spirit of nested sampling: a base run gives you the overall normalisation, and the importance-sampled run then gives you the adjusted Bayes factor for the new likelihood. If you want errors on this, they will be dominated by the evidence error, which can be computed as usual by sampling, using samples.logZ(1000).std().
Very strange, I did test something very similar in the past, getting this plot:
where the N(mu, sigma) refer to normal distributions. The green importance sample of L1 + N(2,1) nicely recovers the orange L2...
Unfortunately the mock data for the corresponding notebook is still on a ship on the Atlantic... so I can't rerun this yet. But this could indicate that this worked as expected in the past and was broken subsequently...?
Could the MCMCSamples.importance_sample from #139 have messed this up?
@lukashergt, this came up in conversation with @htjb this afternoon -- did you get a chance to investigate the discrepancy?
Could the MCMCSamples.importance_sample from https://github.com/williamjameshandley/anesthetic/pull/139 have messed this up?
This sounds plausible. NestedSamples.importance_sample calls MCMCSamples.importance_sample in this line, I believe:
class NestedSamples(MCMCSamples):
    # ...
    def importance_sample(self, logL_new, action='add', inplace=False):
        # ...
        samples = super().importance_sample(logL_new, action=action)
Previously MCMCSamples.importance_sample was broken and did "nothing"; now that it "does something", it could break NestedSamples.importance_sample, which might have relied on the bug in MCMCSamples.importance_sample?
Unfortunately I am not familiar enough with nested sampling at the moment to correctly implement NestedSamples.importance_sample myself, or to confidently review that code.
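The suspected failure mode can be illustrated abstractly (purely hypothetical numbers; this is not what anesthetic does, just what an accidental double application of the importance factor would look like if both the parent class and NestedSamples reweight):

```python
import numpy as np

# If both MCMCSamples.importance_sample and NestedSamples apply the factor
# exp(logL_new), the weights end up skewed by exp(2*logL_new) instead.
weights = np.ones(3)
logL_new = np.array([0.0, -1.0, -2.0])

once = weights * np.exp(logL_new)    # intended reweighting
twice = once * np.exp(logL_new)      # accidental second application

print(once / once.sum())
print(twice / twice.sum())
```

The doubly-applied weights over-concentrate on the high-likelihood samples, which would distort posteriors in exactly the kurtosis-heavy way seen in the plots above.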
Sorry I didn't get around to this sooner. I just did a quick mock test (anesthetic version 2.0.0-beta.12), and my impression is that things still work.
I've used the following likelihoods, once only L1 (blue):
likelihood:
  L1: 'lambda x, y: stats.multivariate_normal.logpdf([x, y], mean=[0, 0], cov=1)'
the other time both L1 and L2 (orange):
likelihood:
  L1: 'lambda x, y: stats.multivariate_normal.logpdf([x, y], mean=[0, 0], cov=1)'
  L2: 'lambda x, y: stats.multivariate_normal.logpdf([x, y], mean=[0.5, 0], cov=0.25)'
I then importance sample the run that only used L1 with L2 (green):
ns3 = ns1.importance_sample(logL_new=stats.multivariate_normal.logpdf(ns1[['x', 'y']].values, mean=[0.5, 0], cov=0.25),
action='add',
inplace=False)
Orange and green should and do agree:
I haven't taken a closer look at @Stefan-Heimersheim's example, yet. I might find some time later today.
Huh, that's curious. It turns out that these test cases depend a lot on the choice of scale/std... I've reduced the example to two parameters.
Case 1: this works fine
2D Gaussian with sigma=0.5, updated with a 1D Gaussian with sigma=0.5
For this first case anesthetic works fine. Green matches orange, where orange again shows the true target distribution (i.e. both likelihoods were sampled), and green shows the importance-sampled update of blue.
Cobaya yaml file for orange (same for blue, just without mvg2):
likelihood:
  mvg1: 'lambda a, b: stats.multivariate_normal.logpdf([a, b], mean=[0, 0], cov=0.5**2)'
  mvg2: 'lambda a: stats.multivariate_normal.logpdf([a], mean=[0], cov=0.5**2)'
params:
  a:
    prior:
      min: -10
      max: 10
  b:
    prior:
      min: -10
      max: 10
sampler:
  polychord:
    nlive: 1000
    num_repeats: 5d
    do_clustering: false
Importance sampling with anesthetic:
green = blue.importance_sample(
logL_new=stats.multivariate_normal.logpdf(blue[['a']].values, mean=[0], cov=0.5**2),
action='add',
inplace=False
)
Posterior distributions, evidence, and KL-divergence all match between orange and green:
Case 2: failing
2D Gaussian with sigma=0.1, updated with a 1D Gaussian with sigma=0.1
In this second case, the only change compared to the first case is the width of the Gaussian likelihoods, from sigma=0.5 to sigma=0.1. Now the importance sampling fails on all levels.
Cobaya yaml file for orange (same for blue, just without mvg2):
likelihood:
  mvg1: 'lambda a, b: stats.multivariate_normal.logpdf([a, b], mean=[0, 0], cov=0.1**2)'
  mvg2: 'lambda a: stats.multivariate_normal.logpdf([a], mean=[0], cov=0.1**2)'
params:
  a:
    prior:
      min: -10
      max: 10
  b:
    prior:
      min: -10
      max: 10
sampler:
  polychord:
    nlive: 1000
    num_repeats: 5d
    do_clustering: false
Importance sampling with anesthetic:
green = blue.importance_sample(
logL_new=stats.multivariate_normal.logpdf(blue[['a']].values, mean=[0], cov=0.1**2),
action='add',
inplace=False
)
Green posteriors are very non-Gaussian (high kurtosis) and most definitely don't agree with orange; the evidence and KL estimates disagree as well:
Not sure what to make of this... Ideas?
- Is this hinting at numerical issues?
- Is this hinting at a limit in the scope of importance sampling with nested samples? If so, it is not the known issue that there have to be sufficient samples in the importance sampled region. We know about that one, and it doesn't apply to this example...
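One quick diagnostic for the second bullet (a hypothetical check, not part of anesthetic; `effective_sample_size` is a helper sketched here using the standard Kish formula) is the effective sample size of the reweighted chain: a sharp update collapses it even when the peak lies inside the sampled region.

```python
import numpy as np

def effective_sample_size(weights, logL_new):
    """Kish effective sample size after multiplying the weights by
    exp(logL_new), computed in log space for numerical stability."""
    logw = np.log(weights) + logL_new
    logw -= logw.max()
    w = np.exp(logw)
    return w.sum()**2 / (w**2).sum()

rng = np.random.default_rng(1)
x = rng.normal(0.0, 0.5, size=10000)   # samples from a sigma=0.5 Gaussian
w0 = np.ones_like(x)

# Mild update (sigma=0.5) vs sharp update (sigma=0.1):
for s in [0.5, 0.1]:
    logL_new = -0.5 * (x / s)**2
    print(f"sigma={s}: ESS = {effective_sample_size(w0, logL_new):.0f}")
```

Even if the pathology here turns out to be in the NestedSamples machinery, a much lower ESS for the sigma=0.1 update would make Case 2 intrinsically harder than Case 1.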
Thanks for those experiments, @lukashergt! Can you check whether Case 2 behaves similarly if you convert the NestedSamples to MCMCSamples before doing the importance sampling? This would show whether it is something to do with sampling density / numerical issues, or with the particular importance sampling implementation for NestedSamples.
Nice test! And indeed, converting the nested samples from Case 2 to MCMC samples and then doing the importance sampling gets you to the correct posterior:
mc7 = MCMCSamples(data=ns7[['a', 'b']], logL=ns7.logL)
mc8 = MCMCSamples(data=ns8[['a', 'b']], logL=ns8.logL)
mc9 = mc7.importance_sample(logL_new=stats.multivariate_normal.logpdf(ns7[['a']].values, mean=[0], cov=0.1**2),
                            action='add',
                            inplace=False)
So it looks like it isn't a numerical issue...
But then why does sigma=0.5 look fine, while sigma=0.1 fails utterly?