sbi.diagnostics?

Open jan-matthis opened this issue 5 years ago • 6 comments

In discussion with @ppjgoncalves, we thought it would be very useful to add diagnostics to sbi to help users decide whether inference is working as it should. I created this issue to collect ideas for diagnostics to implement and for discussion on the topic more generally.

Note that while some diagnostics might be easy to write as functions, others require manual steps, so eventually we might want to make a new tutorial or incorporate them into existing ones.

Diagnostics

  • Dalmasso et al. 2020 describe a diagnostic for methods approximating the likelihood, e.g. SNLE/SNL; see also: https://proceedings.mlr.press/v161/zhao21b/zhao21b.pdf
  • Hermans et al. 2020 describe a ROC diagnostic in section 3.2 of the paper for SNRE/AALR
  • Simulation-based calibration (SBC, Talts et al. 2018). Since SBC requires inference for many observations, this would probably only work for NPE, i.e., single-round posterior estimation
  • Diagnostics to check whether x_o is in-distribution with respect to the training dataset
  • Convergence of neural network loss function
  • Mixing of MCMC chains, R hat, autocorrelations
  • Posterior predictive plots
  • ...?
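
To illustrate the MCMC mixing item above, here is a minimal sketch of the classic Gelman-Rubin R hat for a single scalar parameter. It uses plain NumPy rather than any sbi API; the function name `gelman_rubin_rhat` and the synthetic chains are assumptions made for illustration, and more robust variants (e.g. split or rank-normalized R hat) exist.

```python
import numpy as np


def gelman_rubin_rhat(chains):
    """Classic (non-split) R hat for one scalar parameter.

    chains: array of shape (num_chains, num_draws).
    """
    chains = np.asarray(chains, dtype=float)
    num_chains, num_draws = chains.shape
    chain_means = chains.mean(axis=1)
    # Between-chain variance B and mean within-chain variance W.
    between = num_draws * chain_means.var(ddof=1)
    within = chains.var(axis=1, ddof=1).mean()
    # Pooled estimate of the marginal posterior variance.
    var_plus = (num_draws - 1) / num_draws * within + between / num_draws
    return float(np.sqrt(var_plus / within))


rng = np.random.default_rng(0)
# Four synthetic "chains" that all sample the same distribution.
mixed = rng.normal(size=(4, 1000))
# Same draws, but two chains are shifted to mimic chains stuck in another mode.
stuck = mixed + np.array([[0.0], [0.0], [3.0], [3.0]])

rhat_mixed = gelman_rubin_rhat(mixed)
rhat_stuck = gelman_rubin_rhat(stuck)
print(f"R hat (well mixed): {rhat_mixed:.3f}")
print(f"R hat (stuck):      {rhat_stuck:.3f}")
```

Values close to 1 suggest the chains agree with each other; values clearly above 1 suggest that they have not mixed.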

jan-matthis avatar Nov 02 '20 10:11 jan-matthis

Yes, that's a very good idea!

I have code for SBC by Talts et al. This would require some hyperparameter decisions, which we should discuss. But the coding should be straightforward.
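
For reference, a minimal sketch of the SBC rank computation on a toy conjugate Gaussian model, where the exact posterior is known. This is not the code mentioned above and does not use the sbi API; the function names, the toy model, and the two settings `num_sbc_runs` and `num_posterior_samples` (presumably among the hyperparameters to decide on) are assumptions for illustration.

```python
import numpy as np


def sbc_ranks(sample_posterior, num_sbc_runs, num_posterior_samples, rng):
    """Rank of the true parameter among posterior samples, once per SBC run."""
    ranks = np.empty(num_sbc_runs, dtype=int)
    for i in range(num_sbc_runs):
        theta_true = rng.normal(0.0, 1.0)  # toy prior: N(0, 1)
        x = rng.normal(theta_true, 1.0)  # toy simulator: x ~ N(theta, 1)
        samples = sample_posterior(x, num_posterior_samples, rng)
        ranks[i] = np.sum(samples < theta_true)
    return ranks


def exact_posterior(x, n, rng):
    # Conjugate result for this toy model: theta | x ~ N(x / 2, 1 / 2).
    return rng.normal(x / 2.0, np.sqrt(0.5), size=n)


def overconfident_posterior(x, n, rng):
    # Deliberately wrong: same mean, half the standard deviation.
    return rng.normal(x / 2.0, 0.5 * np.sqrt(0.5), size=n)


def frac_extreme(ranks, num_posterior_samples, tail=0.1):
    """Fraction of ranks in the lowest or highest `tail` share of possible ranks."""
    lo = tail * num_posterior_samples
    hi = (1.0 - tail) * num_posterior_samples
    return float(np.mean((ranks < lo) | (ranks > hi)))


rng = np.random.default_rng(0)
num_sbc_runs, num_posterior_samples = 1000, 99

ranks_exact = sbc_ranks(exact_posterior, num_sbc_runs, num_posterior_samples, rng)
ranks_narrow = sbc_ranks(overconfident_posterior, num_sbc_runs, num_posterior_samples, rng)

# Uniform ranks put about 2 * tail of the mass in the extremes.
frac_exact = frac_extreme(ranks_exact, num_posterior_samples)
frac_narrow = frac_extreme(ranks_narrow, num_posterior_samples)
print(f"extreme-rank fraction, exact posterior:         {frac_exact:.2f}")
print(f"extreme-rank fraction, overconfident posterior: {frac_narrow:.2f}")
```

With a calibrated posterior the ranks are approximately uniform; an overconfident posterior piles ranks up at both ends, which is the U-shaped histogram SBC is designed to reveal.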

janfb avatar Nov 02 '20 11:11 janfb

A review of the Bayesian workflow, which will likely be very relevant for this endeavour: Bayesian workflow (Gelman et al. 2020)

ppjgoncalves avatar Nov 04 '20 13:11 ppjgoncalves

I also just noticed that our logging is very rudimentary, e.g., we only log the number of epochs trained, and the single best validation performance. It would be much better to at least log the (best) validation log prob for every epoch to see how it evolved.
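
A minimal sketch of what per-epoch logging could look like, using a one-dimensional Gaussian density fitted by gradient ascent in plain NumPy. This is not sbi's actual training loop; the function `train_with_history` and the keys of the `history` dictionary are hypothetical names chosen for illustration.

```python
import numpy as np


def gaussian_log_prob(x, mean, log_std):
    std = np.exp(log_std)
    return -0.5 * ((x - mean) / std) ** 2 - log_std - 0.5 * np.log(2.0 * np.pi)


def train_with_history(train_x, val_x, num_epochs=200, lr=0.1):
    """Fit mean and log_std by gradient ascent, logging validation log prob per epoch."""
    mean, log_std = 0.0, 0.0
    best = -np.inf
    history = {"epoch": [], "validation_log_prob": [], "best_validation_log_prob": []}
    for epoch in range(num_epochs):
        std = np.exp(log_std)
        z = (train_x - mean) / std
        # Gradients of the mean training log-likelihood.
        mean += lr * np.mean(z / std)
        log_std += lr * np.mean(z**2 - 1.0)

        val = float(np.mean(gaussian_log_prob(val_x, mean, log_std)))
        best = max(best, val)
        # Record every epoch, not only the final best value.
        history["epoch"].append(epoch)
        history["validation_log_prob"].append(val)
        history["best_validation_log_prob"].append(best)
    return history


rng = np.random.default_rng(0)
train_x = rng.normal(2.0, 0.5, size=500)
val_x = rng.normal(2.0, 0.5, size=200)
history = train_with_history(train_x, val_x)
print("first epoch:", history["validation_log_prob"][0])
print("last epoch: ", history["validation_log_prob"][-1])
```

Keeping the full curve makes it possible to see whether training plateaued, diverged, or was stopped too early, which a single best value cannot show.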

janfb avatar Nov 04 '20 13:11 janfb

For visual analysis, posterior predictive checks, and other diagnostics, it could be interesting to consider ArviZ.

alvorithm avatar Jan 13 '21 11:01 alvorithm

Can you please check the link to the publication by Hermans et al? The link appears not to work.

psteinb avatar Jan 24 '22 08:01 psteinb

How about the coverage property? See Prangle D, Blum MGBB, Popovic G, Sisson SA. Diagnostic tools for approximate Bayesian computation using the coverage property. Aust N Z J Stat. 2014;56: 309–329. doi:10.1111/anzs.12087
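
A rough sketch of the idea, assuming a toy conjugate Gaussian model with a known posterior: for each nominal credible level, count how often the central credible interval contains the parameter that generated the data. This is a simplified illustration and not the exact procedure of the paper; all function names and the toy model are assumptions.

```python
import numpy as np


def empirical_coverage(sample_posterior, levels, num_runs, num_posterior_samples, rng):
    """Fraction of runs in which the central credible interval contains the truth."""
    levels = np.asarray(levels, dtype=float)
    hits = np.zeros(len(levels))
    for _ in range(num_runs):
        theta_true = rng.normal(0.0, 1.0)  # toy prior: N(0, 1)
        x = rng.normal(theta_true, 1.0)  # toy simulator: x ~ N(theta, 1)
        samples = sample_posterior(x, num_posterior_samples, rng)
        lower = np.quantile(samples, (1.0 - levels) / 2.0)
        upper = np.quantile(samples, (1.0 + levels) / 2.0)
        hits += (lower <= theta_true) & (theta_true <= upper)
    return hits / num_runs


def exact_posterior(x, n, rng):
    # Conjugate result for this toy model: theta | x ~ N(x / 2, 1 / 2).
    return rng.normal(x / 2.0, np.sqrt(0.5), size=n)


def underconfident_posterior(x, n, rng):
    # Deliberately wrong: same mean, twice the standard deviation.
    return rng.normal(x / 2.0, 2.0 * np.sqrt(0.5), size=n)


rng = np.random.default_rng(0)
levels = np.array([0.5, 0.8, 0.95])
coverage_exact = empirical_coverage(exact_posterior, levels, 2000, 500, rng)
coverage_wide = empirical_coverage(underconfident_posterior, levels, 2000, 500, rng)
print("nominal levels:          ", levels)
print("coverage, exact:         ", coverage_exact)
print("coverage, underconfident:", coverage_wide)
```

If the coverage property holds, empirical coverage tracks the nominal level; coverage above the nominal level points to posteriors that are too wide, and coverage below it to posteriors that are too narrow.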

yoavram avatar Sep 18 '22 12:09 yoavram

Most of the methods are implemented by now: SBC, ArviZ, coverage.

janfb avatar Feb 16 '24 09:02 janfb