OnlineStats.jl OnlineStats for Bayesian modeling?

Hi Josh,

I've been moving toward MCMC results being in the form of an iterator instead of an array, and encouraging others in this direction as well. This offers convenience and flexibility in a lot of different ways.

There seems to be some interest in this approach from the Turing team: https://github.com/TuringLang/AdvancedHMC.jl/issues/101#issuecomment-531494672

And Tamas Papp is also trying this out for DynamicHMC: https://github.com/tpapp/DynamicHMC.jl/pull/94

Have you done or seen anything in this direction for OnlineStats?

The general idea is to specify a stopping criterion, say a standard error on the mean estimate of some function of the posterior sample. I think it will also be nice to have a way to deal with intermediate results.

A few things are needed for this approach, most already available:

Mean and variance
Standard error of mean estimate
Effective sample size, for use on its own and also in the standard error. Depending on the context, this is computed in terms of autocorrelations or sample weights.
Rank-normalized R-hat
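
To make the stopping-criterion idea concrete, here is a minimal sketch in plain Python (not the OnlineStats API). The sampler is treated as an iterator and consumed until the standard error of a running mean falls below a tolerance. The names `RunningMoments`, `run_until`, and `fake_sampler` are hypothetical, the update is Welford's algorithm, and the standard error assumes independent draws; for correlated MCMC draws, an effective sample size estimate would have to replace `n`.

```python
import itertools
import math
import random


class RunningMoments:
    """Streaming mean and variance via Welford's algorithm (hypothetical helper)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Unbiased sample variance; undefined for fewer than two draws.
        return self._m2 / (self.n - 1) if self.n > 1 else math.nan

    @property
    def std_error(self):
        # Assumes independent draws; swap n for an ESS estimate otherwise.
        return math.sqrt(self.variance / self.n) if self.n > 1 else math.inf


def run_until(draws, f, tol, min_draws=10, max_draws=1_000_000):
    """Consume `draws` until the standard error of mean(f(draw)) is below `tol`."""
    stats = RunningMoments()
    for draw in itertools.islice(draws, max_draws):
        stats.update(f(draw))
        if stats.n >= min_draws and stats.std_error < tol:
            break
    return stats


def fake_sampler(seed=0):
    """Stand-in for an MCMC iterator: independent standard normal draws."""
    rng = random.Random(seed)
    while True:
        yield rng.gauss(0.0, 1.0)


result = run_until(fake_sampler(), f=lambda x: x * x, tol=0.05)
print(result.n, result.mean, result.std_error)
```

Because `run_until` only needs something it can iterate over, intermediate results are available at any point by inspecting `stats` inside the loop.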

Any thoughts on this?

Sep 16 '19 03:09 cscherrer

I haven't done anything MCMC in a while, but I think OnlineStats has all the pieces you need (means, variances, and autocorrelations).

The implementations of Mean and Variance live in OnlineStatsBase, so if you're looking to add minimal dependencies you can go that route. I should probably move AutoCov over there as well.

Sep 16 '19 11:09 joshday

@cscherrer if you're thinking of making a BayesianOnlineStats package I'd be happy to contribute. It'll be a good excuse to spend more time thinking about how to work with streaming samples and to learn OnlineStats.

I think BFMI as well could be supported. It only requires Mean and Variance.
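
As a rough sketch of why a mean and a variance are enough: a common BFMI estimator is the mean squared difference of consecutive energies divided by the variance of the energies, and both can be accumulated one draw at a time. This is plain Python rather than OnlineStats, the class name `StreamingBFMI` is hypothetical, the energies in the usage lines are made-up numbers, and normalization conventions vary slightly between implementations (here, the population variance is used).

```python
class StreamingBFMI:
    """Accumulate BFMI from a stream of energies (hypothetical helper)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0            # sum of squared deviations of the energies
        self._prev = None         # previous energy, needed for differences
        self._n_diff = 0
        self._mean_sq_diff = 0.0  # running mean of squared energy differences

    def update(self, energy):
        # Welford update for the variance of the energies.
        self.n += 1
        delta = energy - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (energy - self.mean)
        # Running mean of squared differences between consecutive energies.
        if self._prev is not None:
            self._n_diff += 1
            sq_diff = (energy - self._prev) ** 2
            self._mean_sq_diff += (sq_diff - self._mean_sq_diff) / self._n_diff
        self._prev = energy

    @property
    def value(self):
        if self.n < 2 or self._m2 == 0.0:
            return float("nan")
        return self._mean_sq_diff / (self._m2 / self.n)


bfmi = StreamingBFMI()
for e in [1.0, 3.0, 2.0, 4.0]:  # illustrative energies, not real sampler output
    bfmi.update(e)
print(bfmi.value)
```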

Nov 23 '19 01:11 sethaxen

Nice! I haven't thought about this much in a few months, but I do think it's important. Currently the best I have is using Transducers: https://github.com/cscherrer/QuasiMonteCarlo.jl

There are really two independent concerns here -- QMC and stream combinators -- but this made a nice sandbox for trying out some ideas.

I think my mental model of the current Julia approach was a bit off. Haskell has a nice "stream fusion" approach that lets you apply a sequence of transformations to a stream without a performance penalty. Transducers is a bit like this turned on its head - there, the transformations compose nicely, as long as you don't actually apply them at each step.
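
One way to read the transducer idea, sketched in plain Python rather than with Transducers.jl: each transformation wraps a reducing step function instead of producing a new stream, so a chain of transformations is composed once and the data is traversed a single time with no intermediate collections. All names here (`mapping`, `filtering`, `compose`, `transduce`, `count_and_sum`) are hypothetical and only illustrate the pattern.

```python
from functools import reduce


def mapping(f):
    """Transducer that applies f to each item before passing it on."""
    def transducer(step):
        def new_step(acc, x):
            return step(acc, f(x))
        return new_step
    return transducer


def filtering(pred):
    """Transducer that passes on only the items satisfying pred."""
    def transducer(step):
        def new_step(acc, x):
            return step(acc, x) if pred(x) else acc
        return new_step
    return transducer


def compose(*transducers):
    """Compose transducers so the leftmost one sees each input first."""
    def composed(step):
        for t in reversed(transducers):
            step = t(step)
        return step
    return composed


def transduce(xform, step, init, items):
    """Fold `items` with `step` after wrapping it in the composed transducer."""
    return reduce(xform(step), items, init)


def count_and_sum(acc, x):
    """Reducing step that tracks how many items arrived and their total."""
    n, total = acc
    return (n + 1, total + x)


# Keep positive draws, square them, and accumulate a mean in one pass.
xform = compose(filtering(lambda x: x > 0), mapping(lambda x: x * x))
n, total = transduce(xform, count_and_sum, (0, 0.0), [-1.0, 2.0, 3.0, -4.0])
mean_sq = total / n
print(n, mean_sq)
```

The same `xform` can be reused with a different reducing step, for example one that updates a streaming variance, without touching the transformations.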

Nov 23 '19 16:11 cscherrer
