Clustering data structure, DP prior and split-merge sampler

Open alexandrebouchard opened this issue 9 years ago • 1 comment

Aug 13 '16 02:08 alexandrebouchard

Start with just a simpler one-point-at-a-time sampler; annealing might be enough.

Need:

Exch<T>: both for large observed data and internal to DPs
- Matrix getSuffStat(T => Matrix), plus the typical ones: moments, sum of logs, etc. [always additive, so can assume it will return a matrix; wait to see if other types of suff stats are needed, probably not?]
- separate observed vs unobserved for efficiency (and put @Immutable on the former) <- no, will complicate things for DP machinery
- add and remove (remove needed for DP machinery); takes care of listeners, etc.
- parser, writer
ExchIntVar, ExchRealVar, ExchMatrix
- addSuffStatCollector
SuffStatCollector has addToSuffStat
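A minimal sketch of the additive sufficient-statistic idea above, written in Python for brevity rather than in the project's own language. The names `Exch`, `SuffStatCollector`, `add_suff_stat_collector` and `add_to_suff_stat` mirror the notes, but the signatures, the `sign` argument, and the use of a plain list of floats in place of a Matrix are all assumptions made for illustration. Listeners, parsing and writing are left out.

```python
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


class SuffStatCollector(Generic[T]):
    """Running sum of stat(item) over the items currently in the collection."""

    def __init__(self, stat: Callable[[T], List[float]], dim: int):
        self.stat = stat
        self.total = [0.0] * dim  # stand-in for a Matrix (assumption)

    def add_to_suff_stat(self, item: T, sign: float = 1.0) -> None:
        # Additivity: adding an item adds stat(item), removing subtracts it.
        for i, value in enumerate(self.stat(item)):
            self.total[i] += sign * value


class Exch(Generic[T]):
    """Exchangeable collection that keeps its collectors in sync on add/remove."""

    def __init__(self) -> None:
        self.items: List[T] = []
        self.collectors: List[SuffStatCollector[T]] = []

    def add_suff_stat_collector(
        self, stat: Callable[[T], List[float]], dim: int
    ) -> SuffStatCollector[T]:
        collector = SuffStatCollector(stat, dim)
        for item in self.items:  # catch up on items already present
            collector.add_to_suff_stat(item)
        self.collectors.append(collector)
        return collector

    def add(self, item: T) -> None:
        self.items.append(item)
        for collector in self.collectors:
            collector.add_to_suff_stat(item)

    def remove(self, item: T) -> None:
        self.items.remove(item)  # raises ValueError if the item is absent
        for collector in self.collectors:
            collector.add_to_suff_stat(item, sign=-1.0)


# Usage: track count, sum and sum of squares of real-valued items.
values = Exch()
moments = values.add_suff_stat_collector(lambda x: [1.0, x, x * x], 3)
for x in (2.0, 3.0, 5.0):
    values.add(x)
values.remove(3.0)
print(moments.total)
```

Because the statistics are additive, remove costs the same as add, which is what the DP machinery would rely on when a point leaves a cluster.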
Then need some distributions on Exch stuff, e.g. PoiGamma, etc. Internally, they just call another distribution, e.g. NegBinomial, with parameters extracted from the suff stat.
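To make the PoiGamma example concrete, here is a hedged Python sketch. It assumes the standard conjugate setup: items are Poisson given a rate, and the rate has a Gamma(shape, rate) prior. The function names and the (n, total) form of the sufficient statistic are illustrative assumptions, not an existing API. Under that setup the predictive of a new item is a negative binomial whose parameters come straight from the sufficient statistic, and chaining predictives gives the marginal of the whole collection.

```python
import math
from typing import Sequence


def neg_binomial_logpmf(x: int, r: float, p: float) -> float:
    """log NegBinomial(x; r, p), with p the success probability and x the failure count."""
    return (
        math.lgamma(x + r) - math.lgamma(r) - math.lgamma(x + 1)
        + r * math.log(p) + x * math.log1p(-p)
    )


def poi_gamma_predictive_logpmf(
    x: int, n: int, total: int, shape: float, rate: float
) -> float:
    """Predictive of a new count x given suff stats (n items, their sum `total`)."""
    # The posterior over the Poisson rate is Gamma(shape + total, rate + n);
    # integrating the rate out yields a negative binomial.
    r = shape + total
    p = (rate + n) / (rate + n + 1.0)
    return neg_binomial_logpmf(x, r, p)


def poi_gamma_log_marginal(xs: Sequence[int], shape: float, rate: float) -> float:
    """Marginal log probability of the whole collection via the chain rule."""
    n, total, log_prob = 0, 0, 0.0
    for x in xs:
        log_prob += poi_gamma_predictive_logpmf(x, n, total, shape, rate)
        n += 1
        total += x
    return log_prob
```

The only data-dependent inputs are the item count and the sum, so the distribution never needs to look at individual items once the collector is up to date.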
Then need clustering data structure
Move: a one-bit change. Argh, in the end we pay too much with that: the cost grows with the number of clusters. Oh well, can create a sampler attached to a model. Still useful for DP vs Pitman-Yor vs discrete.
Prior over clusterings
ClusterLikelihood: Exch<T> | cluster, Distrib<Exch>
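As a rough illustration of how a prior over clusterings and a cluster likelihood could combine in a one-point-at-a-time sampler, here is a Python sketch under stated assumptions. The prior is the Chinese restaurant process form of the DP, each cluster is summarized by a hypothetical (n, total) sufficient statistic, and `predictive_logpmf(x, n, total)` is a caller-supplied function (for example a conjugate predictive). None of these names come from an existing API.

```python
import math
import random
from typing import Callable, Dict, List, Sequence


def crp_log_prior(cluster_sizes: Sequence[int], alpha: float) -> float:
    """Log probability of a partition under the Chinese restaurant process."""
    n = sum(cluster_sizes)
    return (
        len(cluster_sizes) * math.log(alpha)
        + sum(math.lgamma(size) for size in cluster_sizes)  # log (size - 1)!
        + math.lgamma(alpha) - math.lgamma(alpha + n)
    )


def gibbs_reassign(
    i: int,
    data: Sequence[int],
    assignment: List[int],
    clusters: Dict[int, List[int]],  # cluster id -> [n, total]
    alpha: float,
    predictive_logpmf: Callable[[int, int, int], float],
    rng: random.Random,
) -> None:
    """Resample the cluster of point i given all other assignments."""
    x = data[i]
    # Remove point i from its current cluster (this is where remove is needed).
    old = assignment[i]
    clusters[old][0] -= 1
    clusters[old][1] -= x
    if clusters[old][0] == 0:
        del clusters[old]
    # Candidates: every existing cluster plus one fresh cluster.
    new_id = max(clusters, default=-1) + 1
    ids = list(clusters) + [new_id]
    log_w = [
        math.log(clusters[k][0]) + predictive_logpmf(x, *clusters[k])
        for k in clusters
    ]
    log_w.append(math.log(alpha) + predictive_logpmf(x, 0, 0))
    # Normalize in log space before sampling to avoid underflow.
    top = max(log_w)
    weights = [math.exp(w - top) for w in log_w]
    chosen = rng.choices(ids, weights=weights)[0]
    stats = clusters.setdefault(chosen, [0, 0])
    stats[0] += 1
    stats[1] += x
    assignment[i] = chosen
```

Each call loops over all current clusters, so a full sweep costs on the order of (number of points) × (number of clusters) predictive evaluations.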

Nov 30 '17 19:11 alexandrebouchard