blangSDK
Clustering data structure, DP prior and split-merge sampler
Start with a simpler one-point-at-a-time sampler; annealing might be enough.
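A minimal sketch of what a one-point-at-a-time move could compute, assuming a DP (Chinese restaurant process) prior with concentration alpha and a conjugate likelihood whose log predictive densities are already available. The names OnePointMove, assignmentProbabilities and sample are hypothetical and not part of blangSDK.

```java
import java.util.Random;

// Hypothetical sketch of a one-point-at-a-time Gibbs move under a DP (CRP) prior.
// None of these names come from blangSDK.
final class OnePointMove {

    /**
     * clusterSizes:  sizes of the existing clusters AFTER removing the point being moved.
     * logPredictive: log predictive density of the point under each existing cluster,
     *                followed by one extra entry for a brand-new cluster (length K + 1).
     * Returns normalized probabilities over the K existing clusters plus "new cluster".
     */
    static double[] assignmentProbabilities(int[] clusterSizes, double alpha, double[] logPredictive) {
        int k = clusterSizes.length;
        if (logPredictive.length != k + 1) {
            throw new IllegalArgumentException("need K + 1 log predictive values");
        }
        double[] logW = new double[k + 1];
        double max = Double.NEGATIVE_INFINITY;
        for (int i = 0; i <= k; i++) {
            // CRP prior weight: cluster size for an existing cluster, alpha for a new one.
            double prior = (i < k) ? clusterSizes[i] : alpha;
            logW[i] = Math.log(prior) + logPredictive[i];
            max = Math.max(max, logW[i]);
        }
        // Normalize in a numerically stable way (log-sum-exp).
        double[] probs = new double[k + 1];
        double sum = 0.0;
        for (int i = 0; i <= k; i++) {
            probs[i] = Math.exp(logW[i] - max);
            sum += probs[i];
        }
        for (int i = 0; i <= k; i++) {
            probs[i] /= sum;
        }
        return probs;
    }

    /** Draws an index from a normalized probability vector. */
    static int sample(double[] probs, Random rng) {
        double u = rng.nextDouble();
        double cumulative = 0.0;
        for (int i = 0; i < probs.length; i++) {
            cumulative += probs[i];
            if (u < cumulative) {
                return i;
            }
        }
        return probs.length - 1; // guard against rounding error
    }
}
```

The caller would remove the point from its cluster, compute these probabilities, sample a new assignment, and add the point back. Each move touches every cluster, so its cost grows with the number of clusters.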
Need:
- Exch<T>: for both large observed data sets and the internals of DPs
- Matrix getSuffStat(T => Matrix), plus the typical ones: moments, sum of logs, etc. [suff stats are always additive, so can assume it will return a Matrix; wait to see if other types of suff stats are needed, probably not?]
- separate observed vs unobserved for efficiency (and put @Immutable on the former) <- no, this would complicate the DP machinery
- add and remove (remove is needed for the DP machinery); these take care of listeners, etc.
- parser, writer
- ExchIntVar, ExchRealVar, ExchMatrix
- addSuffStatCollector
- SuffStatCollector has addToSuffStat
- Then need some distributions on Exch variables, e.g. PoiGamma, etc. Internally, each just calls another distribution, e.g. NegBinomial, with parameters extracted from the suff stat
- Then need clustering data structure
- Move: a one-bit change. Downside: in the end the cost grows too much with the number of clusters. Workaround: create a sampler attached to a model. Still useful for DP vs PitmanYor vs discrete.
- Prior over clusterings
- ClusterLikelihood:
exch<T> | cluster, Distrib<Exch>
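
A rough sketch of how the Exch<T> items in the list above could fit together, assuming scalar (double) sufficient statistics instead of Matrix for brevity. The method names add, remove, addSuffStatCollector and addToSuffStat follow the list; removeFromSuffStat, value, size and the PoissonGammaSketch helper are hypothetical additions for illustration, not blangSDK API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical sketch; not the blangSDK API.
// A running, additive sufficient statistic: the sum of feature(x) over all points.
final class SuffStatCollector<T> {
    private final ToDoubleFunction<T> feature; // e.g. x -> x, x -> x * x, x -> Math.log(x)
    private double sum = 0.0;

    SuffStatCollector(ToDoubleFunction<T> feature) { this.feature = feature; }

    void addToSuffStat(T point)      { sum += feature.applyAsDouble(point); }
    void removeFromSuffStat(T point) { sum -= feature.applyAsDouble(point); } // assumed helper
    double value()                   { return sum; }
}

// Exchangeable collection: order carries no information, so likelihoods can be
// evaluated from the size and the sufficient statistics alone.
final class Exch<T> {
    private final List<T> points = new ArrayList<>();
    private final List<SuffStatCollector<T>> collectors = new ArrayList<>();

    SuffStatCollector<T> addSuffStatCollector(ToDoubleFunction<T> feature) {
        SuffStatCollector<T> collector = new SuffStatCollector<>(feature);
        for (T p : points) collector.addToSuffStat(p); // catch up on existing points
        collectors.add(collector);
        return collector;
    }

    void add(T point) {
        points.add(point);
        for (SuffStatCollector<T> c : collectors) c.addToSuffStat(point);
    }

    // Needed by the DP machinery: a point leaves one cluster before joining another.
    boolean remove(T point) {
        if (!points.remove(point)) return false;
        for (SuffStatCollector<T> c : collectors) c.removeFromSuffStat(point);
        return true;
    }

    int size() { return points.size(); }
}

// Example of "parameters extracted from the suff stat": with a Gamma(shape, rate)
// prior on a Poisson mean, the posterior after n counts with total `sum` is
// Gamma(shape + sum, rate + n); the resulting predictive is negative binomial.
final class PoissonGammaSketch {
    static double[] posteriorShapeAndRate(double shape, double rate, double sum, int n) {
        return new double[] { shape + sum, rate + n };
    }
}
```

Because every collector is additive, add and remove each cost one update per collector regardless of how many points the collection holds.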