Clustering data structure, DP prior and split-merge sampler

Open alexandrebouchard opened this issue 9 years ago • 1 comments

alexandrebouchard avatar Aug 13 '16 02:08 alexandrebouchard

Start with just a simpler one-point-at-a-time sampler; annealing might be enough.
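
For concreteness, a minimal sketch of what a one-point-at-a-time sweep with an annealing exponent could look like under a CRP(alpha) prior. This is illustrative Python, not Blang code: `gibbs_sweep`, `log_pred`, `beta` and `gaussian_log_pred` are all hypothetical names, and the Gaussian predictive is a crude plug-in used only to make the sketch runnable.

```python
import math
import random


def gibbs_sweep(data, assign, alpha, log_pred, beta=1.0, rng=random):
    """One sweep of one-point-at-a-time reassignment under a CRP(alpha) prior.

    assign[i] is the integer cluster label of data[i] (updated in place).
    log_pred(x, members) is a user-supplied log predictive of x given the
    other points in a cluster (members is [] for a brand-new cluster).
    beta in [0, 1] is an annealing exponent applied to the likelihood term.
    """
    for i, x in enumerate(data):
        # Group all other points by cluster, leaving point i out.
        clusters = {}
        for j, label in enumerate(assign):
            if j != i:
                clusters.setdefault(label, []).append(data[j])
        fresh = max(assign) + 1  # a label guaranteed to be unused
        labels = list(clusters) + [fresh]
        # CRP weight: cluster size for existing clusters, alpha for a new one.
        log_w = [math.log(len(m)) + beta * log_pred(x, m)
                 for m in clusters.values()]
        log_w.append(math.log(alpha) + beta * log_pred(x, []))
        # Normalize in log space before sampling to avoid underflow.
        top = max(log_w)
        weights = [math.exp(lw - top) for lw in log_w]
        assign[i] = rng.choices(labels, weights=weights)[0]
    return assign


def gaussian_log_pred(x, members):
    """Toy stand-in: unit-variance Gaussian centred at the cluster mean
    (or at 0 for an empty cluster). Not a proper conjugate predictive."""
    mean = sum(members) / len(members) if members else 0.0
    return -0.5 * (x - mean) ** 2
```

Each point costs one pass over the existing clusters, which is the per-cluster cost that a split-merge move would be meant to complement.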

Need:

  • Exch<T> : both for large observed data and internal to DPs
    • Matrix getSuffStat(T => Matrix), plus the typical ones: moments, sum of logs, etc. [always additive, so can assume it will return a matrix; wait to see if other types of suff stats are needed, probably not?]
    • separate observed vs unobserved for efficiency (and put @Immutable on former) <- no, will complicate things for DP machinery
    • add and remove (remove needed for DP machinery); these take care of listeners, etc.
    • parser, writer
  • ExchIntVar, ExchRealVar, ExchMatrix
    • addSuffStatCollector
  • SuffStatCollector has addToSuffStat
  • Then need some distributions on Exch stuff, e.g. PoiGamma, etc. Internally, they just call another one, e.g. NegBinomial, with parameters extracted from the suff stat
  • Then need clustering data structure
  • Move: a one-bit change thing. Argh, in the end we pay too much by number of clusters with that. Oh well, can create a sampler attached to a model. Still useful for DP vs Pitman-Yor vs discrete.
  • Prior over clusterings
  • ClusterLikelihood: exch<T> | cluster, Distrib<Exch>
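
A rough sketch of how the Exch / SuffStatCollector pieces above could fit together, again in illustrative Python rather than Blang. Everything here is an assumption made for the example: the class and method names only mirror the bullets, suff stats are plain lists of floats instead of a Matrix, elements are assumed hashable, and the Gamma prior is parameterized by shape `a` and rate `b`.

```python
import math
from collections import Counter


class SuffStatCollector:
    """Running sum of stat(x) over the elements of an Exch (additive)."""

    def __init__(self, stat, dim):
        self.stat = stat              # maps an element to a length-dim tuple
        self.total = [0.0] * dim

    def add_to_suff_stat(self, x, weight=1.0):
        # weight=-1.0 undoes a previous addition, which is what remove needs.
        for k, v in enumerate(self.stat(x)):
            self.total[k] += weight * v


class Exch:
    """Exchangeable collection: a multiset that notifies its collectors."""

    def __init__(self):
        self.counts = Counter()
        self.collectors = []

    def add_suff_stat_collector(self, stat, dim):
        collector = SuffStatCollector(stat, dim)
        # Catch up on elements that were added before this collector existed.
        for x, n in self.counts.items():
            collector.add_to_suff_stat(x, float(n))
        self.collectors.append(collector)
        return collector

    def add(self, x):
        self.counts[x] += 1
        for c in self.collectors:
            c.add_to_suff_stat(x, 1.0)

    def remove(self, x):
        if self.counts[x] == 0:
            raise KeyError(x)
        self.counts[x] -= 1
        if self.counts[x] == 0:
            del self.counts[x]
        for c in self.collectors:
            c.add_to_suff_stat(x, -1.0)


def poi_gamma_predictive_params(collector, a, b):
    """Poisson likelihood with Gamma(shape=a, rate=b) prior: the predictive
    is NegBinomial(r, p), read off the suff stat (n, sum of counts)."""
    n, total = collector.total
    return a + total, (b + n) / (b + n + 1.0)


def neg_binomial_logpmf(k, r, p):
    """log of Gamma(k+r) / (Gamma(r) k!) * p^r * (1-p)^k."""
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(p) + k * math.log(1.0 - p))
```

With a collector built from `lambda x: (1.0, float(x))`, adding and removing a point is a constant-time update of `(n, sum)`, so a DP sampler can move one point between clusters without rescanning the data.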

alexandrebouchard avatar Nov 30 '17 19:11 alexandrebouchard