
Add MCMC Sampling


Description of changes

This is a design proposal and code for adding an MCMC sampler to MXFusion.

Right now the method just uses a Normal distribution as the proposal distribution, but it could be extended to take other proposal distributions. Open to suggestions here, but I'm also happy to merge this in for now as a first pass and improve the generality later as needed (it should be straightforward).
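For reference, here is a minimal sketch of the sampling step described above: random-walk Metropolis-Hastings with a symmetric Normal proposal. It's plain NumPy for illustration only, not the API in `mh_sampling.py`; the names `mh_sample`, `log_prob`, and `n_samples` are assumptions.

```python
import numpy as np

def mh_sample(log_prob, initial, n_samples, variance=1e-4):
    """Random-walk Metropolis-Hastings with a Normal proposal.

    log_prob: callable returning the unnormalized log density.
    initial:  starting point (1-D array).
    """
    samples = np.empty((n_samples, initial.size))
    current = initial.astype(float)
    current_lp = log_prob(current)
    for i in range(n_samples):
        # Symmetric Normal proposal centred at the current state,
        # so the Hastings correction cancels in the acceptance ratio.
        proposal = current + np.sqrt(variance) * np.random.randn(*current.shape)
        proposal_lp = log_prob(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        if np.log(np.random.rand()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        samples[i] = current
    return samples
```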

Testing

It successfully trained the Getting Started tutorial and the PPCA tutorial after adding a prior to m.w and using variance=1e-4 for the proposal distributions.
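For concreteness, adding a prior to `m.w` in the PPCA tutorial looks roughly like this; `D` and `K` and the hyperparameter values are placeholders, not the tutorial's exact settings:

```python
import mxnet as mx
from mxfusion.components.distributions import Normal

# m is the PPCA tutorial's Model; D and K are placeholder dimensions.
# Giving m.w a Normal prior makes the joint density proper for MH.
m.w = Normal.define_variable(mean=mx.nd.array([0]),
                             variance=mx.nd.array([1]),
                             shape=(D, K))
```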

Also wrote two basic 'run-through' tests for the code. There aren't any real correctness tests beyond the manual PPCA run I did to verify that it trained correctly. I'd be open to adding one, but I'm not sure what that would look like in this case.
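One possible shape for a correctness test, sketched below: sample from a target with known closed-form moments and check the empirical estimates within a loose tolerance. `mh_sample` here is the hypothetical entry point from the sketch above, not the name used in `mh_sampling.py`.

```python
import numpy as np

def test_mh_recovers_gaussian_moments():
    # Target: standard Normal, whose mean (0) and variance (1) we know.
    log_prob = lambda x: -0.5 * np.sum(x ** 2)
    np.random.seed(0)
    samples = mh_sample(log_prob, initial=np.zeros(1),
                        n_samples=50000, variance=0.5)
    burned = samples[10000:]  # discard burn-in
    assert abs(burned.mean()) < 0.1
    assert abs(burned.var() - 1.0) < 0.1
```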

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

meissnereric avatar Nov 16 '18 11:11 meissnereric

Codecov Report

Merging #128 into develop will increase coverage by 0.25%. The diff coverage is 95.91%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #128      +/-   ##
===========================================
+ Coverage     85.1%   85.36%   +0.25%     
===========================================
  Files           77       78       +1     
  Lines         3814     3908      +94     
  Branches       653      673      +20     
===========================================
+ Hits          3246     3336      +90     
- Misses         375      376       +1     
- Partials       193      196       +3
| Impacted Files | Coverage Δ |
|---|---|
| mxfusion/inference/inference.py | 81.81% <ø> (ø) :arrow_up: |
| mxfusion/components/distributions/distribution.py | 91.11% <ø> (ø) :arrow_up: |
| mxfusion/models/model.py | 100% <ø> (ø) :arrow_up: |
| mxfusion/inference/inference_parameters.py | 89.39% <100%> (+0.5%) :arrow_up: |
| mxfusion/models/factor_graph.py | 85% <100%> (+0.1%) :arrow_up: |
| mxfusion/inference/mh_sampling.py | 95.45% <95.45%> (ø) |

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update aa45a3b...3fa1dcc.

codecov-io avatar Jan 10 '19 14:01 codecov-io

@meissnereric how close is Gibbs sampling to getting merged?

anirudhacharya avatar Mar 23 '19 01:03 anirudhacharya

> @meissnereric how close is Gibbs sampling to getting merged?

Thanks for your interest. This implementation only contains Metropolis-Hastings and does not support Gibbs sampling, because a proper Gibbs sampling algorithm needs to recognize conjugacy in the graphical model, which is not supported in MXFusion at the moment.

If you have a special need for Gibbs sampling or other related inference methods, please let us know.

zhenwendai avatar Mar 25 '19 16:03 zhenwendai

@zhenwendai My use case was to build a topic model like LDA (or similar PGMs) with MXFusion. And yes, having Gibbs sampling would be good.

anirudhacharya avatar Mar 25 '19 17:03 anirudhacharya

@anirudhacharya Implementing an LDA model with stochastic variational inference is hard. See this issue about implementing LDA in Edward (https://github.com/blei-lab/edward/issues/463). A Gibbs sampling implementation is very different from the inference methods we have already implemented, because it relies on rule matching to identify conjugacy in the model definition. We have no plans to extend in this direction at the moment, but if you are motivated to implement an inference algorithm along these lines, we would be very happy to take your contribution. A reference implementation of Gibbs sampling can be found in Edward (https://github.com/blei-lab/edward/blob/master/edward/inferences/gibbs.py).
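To illustrate what that rule matching buys you, here is a minimal sketch (my own, not from Edward or MXFusion) of a single Gibbs update for a conjugate Beta-Bernoulli pair. Recognizing this pattern in the graph is what lets a sampler draw from the exact full conditional instead of falling back to a generic MH step:

```python
import numpy as np

def gibbs_update_theta(x, alpha=1.0, beta=1.0):
    """Draw theta from its exact full conditional.

    Model: theta ~ Beta(alpha, beta), x_i ~ Bernoulli(theta).
    Conjugacy gives the closed-form conditional
        theta | x ~ Beta(alpha + sum(x), beta + n - sum(x)).
    """
    n, k = len(x), int(np.sum(x))
    return np.random.beta(alpha + k, beta + n - k)

# Example: 100 coin flips with true theta = 0.3.
x = np.random.rand(100) < 0.3
theta_draws = [gibbs_update_theta(x) for _ in range(1000)]
```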

zhenwendai avatar Mar 28 '19 14:03 zhenwendai