Venturecxx
Potential Venstan example? "double" mixture model
See: https://webfiles.uci.edu/mdlee/LeeWagenmakers2013_Free.pdf ("6.4 The two country quiz") https://groups.google.com/forum/#!topic/stan-users/p2zWntwTbG0
- Some context: all of the models in the Lee & Wagenmakers book were made into Stan examples that are linked from their website, except for three (due to inability to implement them in Stan); this is one of them.
- It’s a double mixture model: discrete latent x_i, discrete latent z_j, and observed k_ij, where the distribution of the k_ij depends on a function of the x_i and z_j.
- Summing out the latents entirely (as you would have to do in Stan) is intractable, because they are coupled in the posterior, so you would have to sum over all combinations of all the x_i and z_j.
- However, I think it is the case that the x_i are independent of each other conditioned on all the z_j (and k_ij), and the z_j are independent of each other conditioned on the x_i (and k_ij). So one could imagine a Venture-driven/Stan-powered Gibbs sampling scheme: Venture samples the x_i, hands them to Stan to sample the parameters integrating out the z_j, Venture samples the z_j, Stan samples the parameters integrating out the x_i, repeat.
- It’s not that interesting, because this example (and all the others in the book) were already implemented in BUGS to begin with, so a Venstan implementation would not be clearly novel; but maybe we can come up with some extension that meaningfully uses the expressive power of Venture.
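To make the blocked-Gibbs idea concrete, here is a minimal sketch of the discrete half of that scheme in plain Python/NumPy. Everything here is an assumption for illustration: the parameterization (k[i, j] ~ Bernoulli(alpha) when x[i] == z[j], else Bernoulli(beta)), the uniform priors on x_i and z_j, and the fixed alpha/beta (in the full model those parameters would be sampled too, e.g. by Stan with the discretes held fixed). It is not the exact Lee & Wagenmakers specification.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical instantiation of the "two country quiz" double mixture:
# x[i] is person i's latent group, z[j] is question j's latent type, and
# k[i, j] ~ Bernoulli(alpha) when x[i] == z[j], else Bernoulli(beta).

def resample_x(x, z, k, alpha, beta):
    # Each x[i] is independent of the other x's given z (and k), so it
    # can be drawn exactly from its two-point conditional.
    for i in range(len(x)):
        logp = []
        for v in (0, 1):
            p = np.where(z == v, alpha, beta)
            logp.append((k[i] * np.log(p) + (1 - k[i]) * np.log(1 - p)).sum())
        # p(x[i] = 1 | z, k) under a uniform prior on x[i]
        x[i] = int(rng.random() < 1.0 / (1.0 + np.exp(logp[0] - logp[1])))
    return x

def gibbs_sweep(x, z, k, alpha, beta):
    # Alternate the two blocks: x given z, then z given x.  By the
    # symmetry of the model, resampling z is the same computation with
    # the roles of people and questions transposed.
    x = resample_x(x, z, k, alpha, beta)
    z = resample_x(z, x, k.T, alpha, beta)
    return x, z
```

In the proposed scheme the exact two-point conditionals above would live on the Venture side, with Stan's HMC handling the continuous parameters between sweeps.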
When @Axch and I discussed these examples in the L&W book before the PI meeting, the conclusion was that we could not really tell a compelling story for Venstan with toy examples -- one recurring pattern I have seen in the literature is that researchers take some probabilistic model which uses mixtures and make it an infinite mixture, which turns into an 80-page journal paper. Perhaps we can find something compelling using a similar approach (as @axch says, tuning \alpha is easier than tuning K).
Strongly disagree. A much broader audience cares about L&W than eg the Dunson paper. Plus the example Anthony found highlights the limitations of Stan's claim to "handle discrete variables".
Remember the purpose of these is not to show something is possible representationally that was previously impossible, but instead to show that with Venture, one probprog lang can be used to optimize parts of a probprog written in another, using a model of acknowledged interest. The fact that a BUGS impl exists is just validation of its interest. This is technically about illustrating SP interface capabilities/consequences, and highlighting the advantages of a polyglot platform.
A separate type of point could be that with Venture we can extend models (and the associated inference schemes) written in other languages while retaining some (presumably tested) parts of the original implementation. That's not an interesting pitch to people happy with the expressiveness of the other languages or dubious about the practical utility of the fancier stuff.
A good MML paper template might be to find and compress then extend papers like Dunson's but that's for a different audience still.
Also, a red flag is the word "interesting" --- the key question is to whom, and do we want to reach them, and why.
RE:
highlights the limitations of Stan's claim to "handle discrete variables".
In the thread, Carpenter clearly says that Stan can in principle handle the discretes, but it will not be tractable.
There are 2^8 possible values for x and 2^8 possible values for z in
this tiny tiny example, but already that's 2^16, or about 64000, summands.
I don't think there's any way to code this model efficiently in Stan.
You can just sum over the 2^16 values of x, z, but that's going to be
very painful to code, slow to run, and not scalable.
- Carpenter
I mention this because there appears to be an insinuation that the Stan folk are making inaccurate claims about their system.
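For concreteness, the back-of-envelope arithmetic behind Carpenter's point (Stan's marginalization enumerates every joint configuration of the coupled latents, whereas a blocked Gibbs scheme only touches two-point conditionals) works out as follows; the per-sweep count assumes the conditional-independence structure described above:

```python
# 8 binary x_i and 8 binary z_j, as in the quoted thread.
full_enumeration = (2 ** 8) * (2 ** 8)  # every joint (x, z) configuration
per_gibbs_sweep = (8 + 8) * 2           # two-point conditionals, one sweep
print(full_enumeration, per_gibbs_sweep)  # 65536 32
```

(2^16 is 65536, i.e. the "about 64000" in the quote.) The enumeration cost also grows exponentially in the number of people and questions, while the per-sweep cost grows linearly.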
RE
but instead to show that with Venture, one probporg lang can be used to optimize parts of a probprog written in another
The original conclusion reached by @axch and me about the 3 examples in the L&W book (and why we did not implement them for the PI meeting, instead shooting for VenStanKep) was that Venture can handle these cognitive models tractably anyway. Bolting on Stan does not achieve the stated target of "optimizing parts of a probprog", because sampling the discretes in Venture and then using HMC in Stan offers no obvious advantage (computational or otherwise) over pure VentureScript. The cognitive example felt too artificial to tell a compelling story that we are "overcoming shortcomings of both systems" by being polyglot.
Re Carpenter: it's useful for us to have that quote, and to remember the difference between "the language supporting something in principle" and "it actually being supported in practice, without severe and/or unpredictable restrictions".
Have you seen the Venture Goals document? There's a proposal for a VenStan interface that I think will lead to different conclusions.
Project management decision: What are we doing with this? What are the trigger events that cause us to reconsider this?