
API class design


Design considerations

Goal: organize classes for inference and posterior analysis API

  • Continuous parameters only, or also discrete? (Discrete data and generated quantities should be allowed either way.)

  • Two ways to operate (easily interconvertible; see the sketch after this list):

    • batch: take a batch of draws of a given size
    • online: take a single draw
  • How to deal with transforms and generated quantities as in Stan? They generate new random variables as functions of others, which get their own expectations, control variates, etc.

  • Densities are just functions over vectors rather than over arbitrary sequences of data types

  • Could introduce a higher-level notion of a distribution, as in Boost, which bundles densities and RNGs; this could apply to prior specification or serve as a way to describe a posterior class (sample plus density plus derivatives)

  • How to store all the meta-data like configuration, dates, etc.? Probably just a dictionary that is ideally usable as input.
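
To make the batch/online point above concrete, here is a minimal sketch, assuming a hypothetical `Sampler` protocol whose `sample()` method returns a NumPy vector; going the other direction (online from batch) is just iterating over a stored batch.

```python
import numpy as np
from typing import Protocol

class Sampler(Protocol):
    def sample(self) -> np.ndarray:
        """Return a single draw (online mode)."""
        ...

def sample_batch(sampler: Sampler, num_draws: int) -> np.ndarray:
    """Batch mode built from online mode: stack num_draws single draws."""
    return np.stack([sampler.sample() for _ in range(num_draws)])
```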

Models

If the prior and likelihood are both specified, the log prior plus the log likelihood equals the log density plus a constant.
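
In symbols, with parameters $\theta$ and data $y$:

$$\log p(\theta) + \log p(y \mid \theta) = \log p(\theta \mid y) + \text{const.}$$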

  • float log_density(vector), vector grad_log_density(vector), matrix hessian_log_density(vector)

  • float log_prior(vector), vector grad_log_prior(vector), matrix hessian_log_prior(vector)

  • float log_likelihood(vector), vector grad_log_likelihood(vector), matrix hessian_log_likelihood(vector)

  • vector prior_sample()
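
A minimal sketch of this model interface as a Python protocol (the class name, exact typed signatures, and use of NumPy arrays are illustrative assumptions, not settled API):

```python
import numpy as np
from typing import Protocol

class Model(Protocol):
    # log_density = log_prior + log_likelihood + const. when both are specified
    def log_density(self, theta: np.ndarray) -> float: ...
    def grad_log_density(self, theta: np.ndarray) -> np.ndarray: ...
    def hessian_log_density(self, theta: np.ndarray) -> np.ndarray: ...

    def log_prior(self, theta: np.ndarray) -> float: ...
    def grad_log_prior(self, theta: np.ndarray) -> np.ndarray: ...
    def hessian_log_prior(self, theta: np.ndarray) -> np.ndarray: ...

    def log_likelihood(self, theta: np.ndarray) -> float: ...
    def grad_log_likelihood(self, theta: np.ndarray) -> np.ndarray: ...
    def hessian_log_likelihood(self, theta: np.ndarray) -> np.ndarray: ...

    def prior_sample(self) -> np.ndarray: ...
```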

Monte Carlo samplers

  • vector sample()

  • array(vector) ensemble_sample()

  • (vector, weight) importance_sample()

  • sampler sampling_importance_resampler(importance_sampler)
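
A hedged sketch of how those signatures might look as Python protocols, with a sampling-importance-resampling wrapper (resampling one draw per call from a fresh batch of weighted particles is just one possible reading):

```python
import numpy as np
from typing import Protocol

class Sampler(Protocol):
    def sample(self) -> np.ndarray: ...

class EnsembleSampler(Protocol):
    def ensemble_sample(self) -> list[np.ndarray]: ...

class ImportanceSampler(Protocol):
    def importance_sample(self) -> tuple[np.ndarray, float]: ...

def sampling_importance_resampler(imp: ImportanceSampler, num_particles: int = 100) -> Sampler:
    """Wrap an importance sampler as a plain sampler via resampling."""
    class _Resampler:
        def sample(self) -> np.ndarray:
            draws, weights = zip(*(imp.importance_sample() for _ in range(num_particles)))
            probs = np.asarray(weights) / np.sum(weights)
            return draws[np.random.default_rng().choice(num_particles, p=probs)]
    return _Resampler()
```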

Variational approximation

Approximate samplers add an approximate density, effectively giving an importance sampler.

  • importance_sampler approximate_sampler()
  • dictionary variational_fit()
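
One way to read "approximate density plus approximate sampler gives an importance sampler" in code; the wrapper class and its constructor arguments are hypothetical:

```python
import numpy as np

class ApproximationAsImportanceSampler:
    """Wrap an approximation q (sampler + log density) as an importance sampler for the target p."""

    def __init__(self, model, approx_sample, approx_log_density):
        self._model = model                  # target: exposes log_density(theta)
        self._approx_sample = approx_sample  # () -> draw from q
        self._approx_log_density = approx_log_density  # theta -> log q(theta)

    def importance_sample(self):
        theta = self._approx_sample()
        # weight p(theta) / q(theta), up to the target's unknown normalizing constant
        weight = np.exp(self._model.log_density(theta) - self._approx_log_density(theta))
        return theta, weight
```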

Laplace approximation

Approximate samplers add an approximate density, effectively giving an importance sampler.

  • importance_sampler laplace_fit()
  • sampler approximate_sampler()
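
A rough sketch of laplace_fit against the model interface above, using SciPy to find the mode; the optimizer choice and returning a (sampler, log density) pair are assumptions:

```python
import numpy as np
from scipy import optimize, stats

def laplace_fit(model, init: np.ndarray):
    """Multivariate normal approximation at the mode with covariance -H(mode)^{-1}."""
    result = optimize.minimize(
        lambda theta: -model.log_density(theta),
        init,
        jac=lambda theta: -model.grad_log_density(theta),
    )
    mode = result.x
    cov = np.linalg.inv(-model.hessian_log_density(mode))
    approx = stats.multivariate_normal(mean=mode, cov=cov)
    # the sampler/density pair can be fed to the importance-sampling wrapper above
    return approx.rvs, approx.logpdf
```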

Control variates

Tricky because they add an expectation-specific shadow value that is averaged along with the draws to compute the estimate of the expectation.

  • vector control_variate(vector draws, array(vector) gradients)
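
To make the shadow-value point concrete, here is a minimal sketch of a linear (zero-variance-style) control variate based on the score function; the specific regression form is my assumption, and the signature above is more general:

```python
import numpy as np

def control_variate_estimate(values: np.ndarray, gradients: np.ndarray) -> float:
    """Estimate E[f] from per-draw values f(theta_n) and scores grad log p(theta_n).

    The score has expectation zero under the target, so values - gradients @ c
    keeps the same expectation for any c; c is chosen to minimize variance.
    """
    centered_grads = gradients - gradients.mean(axis=0)
    centered_vals = values - values.mean()
    coef, *_ = np.linalg.lstsq(centered_grads, centered_vals, rcond=None)
    adjusted = values - gradients @ coef
    return float(adjusted.mean())
```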

Top-level controllers

Allow user to specify:

  • total time to spend on inference
  • target ESS (e.g., implemented by iterative deepening; see the sketch after this list)
  • number of iterations
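
A sketch of the iterative-deepening idea for the ESS target; the ess() argument and the doubling schedule are assumptions:

```python
import numpy as np

def run_to_target_ess(sampler, ess, target_ess: float,
                      initial_draws: int = 100, max_draws: int = 100_000) -> np.ndarray:
    """Double the number of draws until the estimated ESS reaches the target."""
    draws = [sampler.sample() for _ in range(initial_draws)]
    while ess(np.stack(draws)) < target_ess and len(draws) < max_draws:
        # iterative deepening: double the total sample size each round
        draws.extend(sampler.sample() for _ in range(len(draws)))
    return np.stack(draws)
```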

How to handle adaptation for adaptive samplers? They are just stateful, perhaps in a discrete way like Stan's Phase II warmup.

How to monitor convergence of adaptation to stop adaptation and start sampling?

Posterior analysis

All algorithms should work with ragged data structures (e.g., chains of unequal length).

  • R-hat (basic, split, rank-normalized, mini-group nested R-hat)
  • ESS (basic, multi-chain R-hat based, different estimators?)
    • head vs. tail
  • sample mean, standard deviation
  • standard error (if standard deviation and ESS are available)
  • quantiles (median, central interval, arbitrary)
  • log density output
  • other diagnostics (e.g., tree depth, divergence, etc.)
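
As one concrete example from the list above, a basic split R-hat sketch for a single scalar quantity, assuming equal-length chains in a rectangular array rather than the ragged case:

```python
import numpy as np

def split_rhat(chains: np.ndarray) -> float:
    """Basic split R-hat for a (num_chains, num_draws) array of one scalar quantity."""
    half = chains.shape[1] // 2
    # split each chain in half and treat the halves as separate chains
    split = np.vstack([chains[:, :half], chains[:, half:2 * half]])
    chain_means = split.mean(axis=1)
    between = half * chain_means.var(ddof=1)
    within = split.var(axis=1, ddof=1).mean()
    var_plus = (half - 1) / half * within + between / half
    return float(np.sqrt(var_plus / within))
```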

bob-carpenter, Dec 16 '22

Another issue I'd like to bring up for API design is how we run multiple samplers. For example, a sampler like random walk Metropolis generates a single Markov chain of draws (that is, an iterator of vectors). We need to be able to group a bunch of these chains together and run them. That converts a single chain interface into a multiple-chain interface like one of the ensemble methods. I'm thinking this is mainly going to be an issue when a user wants to gather up all the draws and (a) analyze posterior convergence/ESS, and (b) do inference. For (a), we need to keep the draws separated into chains, whereas for (b) we want to just throw them together into one big collection.
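
A sketch of that grouping with hypothetical names: keep the per-chain structure for (a) and pool for (b):

```python
import numpy as np

def run_chains(make_sampler, num_chains: int, num_draws: int) -> list[np.ndarray]:
    """Run independent chains; keeping them separate supports R-hat/ESS (case a)."""
    chains = []
    for _ in range(num_chains):
        sampler = make_sampler()
        chains.append(np.stack([sampler.sample() for _ in range(num_draws)]))
    return chains

def pooled_draws(chains: list[np.ndarray]) -> np.ndarray:
    """Throw all chains together into one big collection for inference (case b)."""
    return np.concatenate(chains, axis=0)
```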

bob-carpenter, Jan 26 '23