sbi sbi.diagnostics?

In discussion with @ppjgoncalves, we thought it would be very useful to add diagnostics to sbi to help users decide whether inference is working as it should. I created this issue to collect ideas for diagnostics to implement and for discussion on the topic more generally.

Note that while some diagnostics might be easy to write as functions, others require manual steps, so eventually we might want to make a new tutorial or incorporate them into existing ones.

Diagnostics

Dalmasso et al. 2020 describe a diagnostic for methods approximating the likelihood, e.g. SNLE/SNL, see also: https://proceedings.mlr.press/v161/zhao21b/zhao21b.pdf
Hermans et al. 2020 describe a ROC diagnostic in section 3.2 of the paper for SNRE/AALR
Simulation-based calibration (SBC, Talts et al. 2018). Since SBC requires inference for many observations, this would probably only work for NPE, i.e., single-round posterior estimation
Diagnostics to check whether x_o is in-distribution with respect to the training dataset
Convergence of neural network loss function
Mixing of MCMC chains, R hat, autocorrelations
Posterior predictive plots
...?
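To make the SBC item concrete, here is a minimal sketch of the rank statistic on a toy conjugate Gaussian model. It does not use any sbi API: the function name `sbc_ranks`, the `posterior_scale` knob, and the model itself (prior N(0, 1), likelihood N(theta, sigma^2)) are assumptions chosen only because the exact posterior is known in closed form. In practice the posterior samples would come from the trained estimator instead.

```python
import numpy as np


def sbc_ranks(num_sbc_runs=500, num_posterior_samples=100, sigma=1.0,
              posterior_scale=1.0, seed=0):
    """Return SBC ranks for a toy Gaussian model (illustrative only).

    posterior_scale != 1 deliberately miscalibrates the posterior width,
    standing in for an over- or under-confident learned posterior.
    """
    rng = np.random.default_rng(seed)

    # 1) Draw ground-truth parameters from the prior and simulate data.
    theta = rng.normal(0.0, 1.0, size=num_sbc_runs)
    x = rng.normal(theta, sigma)

    # 2) "Infer" the posterior for each simulated observation.
    #    Exact posterior for this model: N(x / (1 + sigma^2), sigma^2 / (1 + sigma^2)).
    post_mean = x / (1.0 + sigma**2)
    post_std = np.sqrt(sigma**2 / (1.0 + sigma**2)) * posterior_scale
    samples = rng.normal(post_mean[:, None], post_std,
                         size=(num_sbc_runs, num_posterior_samples))

    # 3) Rank of the true parameter among its posterior samples.
    #    For a calibrated posterior the ranks are uniform on {0, ..., L}.
    return (samples < theta[:, None]).sum(axis=1)


calibrated = sbc_ranks()
overconfident = sbc_ranks(posterior_scale=0.5)
```

A histogram of `calibrated` should look roughly flat, while `overconfident` should pile up near 0 and `num_posterior_samples` (a U shape). The number of SBC runs and of posterior samples per run are the kind of choices that would need to be settled for a real implementation.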

Nov 02 '20 10:11 jan-matthis

Yes, that's a very good idea!

I have code for SBC by Talts et al. This would require some hyperparameter decisions, which we should discuss. But the coding should be straightforward.

Nov 02 '20 11:11 janfb

A review of the Bayesian workflow, which will likely be very relevant for this endeavour: Bayesian workflow (Gelman et al. 2020)

Nov 04 '20 13:11 ppjgoncalves

I also just noticed that our logging is very rudimentary, e.g., we only log the number of epochs trained, and the single best validation performance. It would be much better to at least log the (best) validation log prob for every epoch to see how it evolved.
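A rough sketch of what per-epoch logging could look like, independent of how sbi's training loop is actually written. The names `train_with_history`, `train_epoch`, and `validation_log_prob` are hypothetical placeholders, and the early-stopping rule is an assumption for illustration.

```python
def train_with_history(train_epoch, validation_log_prob,
                       max_epochs=100, patience=5):
    """Run a training loop and record the validation log prob of every epoch.

    train_epoch(epoch) performs one epoch of training (hypothetical callable).
    validation_log_prob(epoch) returns the current validation log prob
    (hypothetical callable).
    """
    history = {"validation_log_prob": [], "best_validation_log_prob": []}
    best = float("-inf")
    epochs_since_improvement = 0

    for epoch in range(max_epochs):
        train_epoch(epoch)
        val = float(validation_log_prob(epoch))

        if val > best:
            best = val
            epochs_since_improvement = 0
        else:
            epochs_since_improvement += 1

        # Log both the raw value and the running best for this epoch.
        history["validation_log_prob"].append(val)
        history["best_validation_log_prob"].append(best)

        # Stop once validation performance has not improved for `patience` epochs.
        if epochs_since_improvement >= patience:
            break

    return history
```

Plotting `history["validation_log_prob"]` against the epoch index would then show how the validation performance evolved, instead of only its final best value.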

Nov 04 '20 13:11 janfb

For visual analysis, posterior predictive checks, and other diagnostics, it could be interesting to consider ArviZ.

Jan 13 '21 11:01 alvorithm

Can you please check the link to the publication by Hermans et al? The link appears not to work.

Jan 24 '22 08:01 psteinb

How about the coverage property? See Prangle D, Blum MGBB, Popovic G, Sisson SA. Diagnostic tools for approximate Bayesian computation using the coverage property. Aust N Z J Stat. 2014;56: 309–329. doi:10.1111/anzs.12087

Sep 18 '22 12:09 yoavram

Most of the methods are implemented by now: SBC, ArviZ, coverage.

Feb 16 '24 09:02 janfb