Propagation of evidence uncertainties

Open slzoutendijk opened this issue 2 years ago • 1 comment

  • UltraNest version: 3.3.3, 3.4.4
  • Python version: 3.6.12, 3.9.5
  • Operating System: RHEL 7.9, CentOS 7.6

Thanks for creating UltraNest, I really like the improvements relative to MultiNest. I have a question, though, which I hope you would be willing to answer.

Description

I am combining Bayesian evidence determined with UltraNest from several datasets. (Analysing the datasets separately gives me a huge speed improvement, because it keeps the dimensionality low.) In my case, the combined logZ is just the sum of individual logZ's.

I would like to calculate the statistical uncertainty of the resulting combined evidence. UltraNest gives me a logZerr for each dataset's logZ, but how do I propagate these? Are the errors of either Z or logZ (close to) normally distributed, allowing for a simple algebraic propagation? Or do I need to compute the combined logZerr from the samples? If so, how could I do that? Doing multiple runs of UltraNest to empirically find the variance of the combined logZ is unfortunately prohibitively computationally expensive.

I imagine that this question is also relevant for computing the uncertainty of a Bayes factor or posterior odds ratio.

What I Did

I have found the line where the uncertainty is calculated from the information h. However, I do not understand where the equations used to compute h come from, hence I do not know its expected distribution or how to generalize the computation to multiple independent datasets. I tried reading Buchner (2016; 2019; 2021), but could not find it there.

slzoutendijk, May 05 '22 16:05

Yes, you can assume the uncertainties are approximately normally distributed in logZ.
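As a minimal sketch of what that implies for the question above: if each logZ estimate is approximately normal and the datasets are independent, the errors of a sum (or of a difference, as in a log Bayes factor) add in quadrature. The function names and the numbers below are made up for illustration; in practice the inputs would be the logZ and logZerr values reported for each run.

```python
import math

def combine_logz(logz_values, logz_errors):
    """Sum independent logZ estimates; add their errors in quadrature.

    Assumes each estimate is approximately normal in logZ and that the
    runs are statistically independent (hypothetical helper).
    """
    total = sum(logz_values)
    total_err = math.sqrt(sum(err ** 2 for err in logz_errors))
    return total, total_err

def log_bayes_factor(logz_a, err_a, logz_b, err_b):
    """Difference of two independent logZ estimates with its propagated error."""
    return logz_a - logz_b, math.hypot(err_a, err_b)

# Illustrative numbers only, not real UltraNest output.
logz, logzerr = combine_logz([-10.0, -20.0], [0.3, 0.4])
print(f"combined logZ = {logz:.2f} +- {logzerr:.2f}")

log_bf, log_bf_err = log_bayes_factor(-10.0, 0.3, -12.0, 0.4)
print(f"log Bayes factor = {log_bf:.2f} +- {log_bf_err:.2f}")
```

The same quadrature rule covers both cases because the variance of a sum and of a difference of independent variables is the sum of the variances.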

In ultranest.ReactiveNestedSampler, the uncertainty comes from a family of integrators that is launched, each of which is blinded to some of the live points (using bootstrapping). The uncertainty is then the standard deviation of the logZ estimates. This is more conservative, and more robust when the sampler is imperfect (MultiNest underestimates the uncertainties across reruns). The information term is also considered (that part is better described in Skilling's papers).
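If one wanted to check the normal approximation rather than rely on it, a rough sketch is to resample from per-dataset collections of bootstrapped logZ estimates and look at the spread of the summed values. This assumes such per-dataset lists are available; how to extract them from a run is not covered in this thread, and the helper name and the numbers below are hypothetical.

```python
import random
import statistics

def combined_logz_spread(per_dataset_estimates, n_draws=10000, seed=0):
    """Monte Carlo spread of a summed logZ.

    per_dataset_estimates: one list of bootstrapped logZ estimates per
    dataset (hypothetical input). Each draw picks one estimate per
    dataset independently and sums them.
    """
    rng = random.Random(seed)
    sums = [
        sum(rng.choice(estimates) for estimates in per_dataset_estimates)
        for _ in range(n_draws)
    ]
    return statistics.mean(sums), statistics.stdev(sums)

# Made-up bootstrapped estimates for two independent datasets.
estimates_a = [-10.3, -10.0, -9.7]
estimates_b = [-20.4, -20.0, -19.6]

mean_logz, spread = combined_logz_spread([estimates_a, estimates_b])
print(f"combined logZ = {mean_logz:.2f} +- {spread:.2f}")
```

For independent datasets the resulting spread should be close to the quadrature sum of the per-dataset standard deviations, which is what the normal approximation predicts.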

The relevant code:

  • A single integrator, handling the event that a live point is discarded and either replaced or not: https://github.com/JohannesBuchner/UltraNest/blob/master/ultranest/netiter.py#L520
  • A family of bootstrapped integrators: https://github.com/JohannesBuchner/UltraNest/blob/master/ultranest/netiter.py#L745
  • Where the object is created: https://github.com/JohannesBuchner/UltraNest/blob/master/ultranest/integrator.py#L2394

JohannesBuchner, Sep 08 '22 21:09