proposal for mixture models

Open bob-carpenter opened this issue 1 year ago • 14 comments

The feature proposal is in the file 0034-mixtures.md

bob-carpenter avatar Feb 14 '24 20:02 bob-carpenter

Rendered version

https://github.com/stan-dev/design-docs/blob/0034-mixtures/designs/0034-mixtures.md

SteveBronder avatar Feb 14 '24 21:02 SteveBronder

Thanks, @SteveBronder. Does it render automatically or did you have to do something to turn it on?

bob-carpenter avatar Feb 15 '24 15:02 bob-carpenter

Do you intend to support mixtures of distributions with different support (e.g. a mixture of a normal and a lognormal)?

jsocolar avatar Feb 15 '24 17:02 jsocolar

Do you intend to support mixtures of distributions with different support (e.g. a mixture of a normal and a lognormal)?

Yes. There's an example of a zero-inflated Poisson that mixes a distribution with support at a single point with a distribution with support over all natural numbers.

Are there problems that arise in that setting we should consider? Even if there are, support isn't something that's known statically, so it'd have to be some kind of run-time check.
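For reference, a minimal Python sketch of that zero-inflated Poisson case, written as a two-component mixture on the log scale. This is only an illustration of the arithmetic, not the proposed Stan syntax; the function names here are made up for the example, and it assumes 0 < theta < 1 and lam > 0.

```python
import math


def log_sum_exp(a, b):
    """Stable log(exp(a) + exp(b)); either argument may be -inf."""
    m = max(a, b)
    if m == -math.inf:
        return -math.inf
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def point_mass_at_zero_lpmf(y):
    # Support is a single point: log(1) at zero, log(0) everywhere else.
    return 0.0 if y == 0 else -math.inf


def poisson_lpmf(y, lam):
    # Support is all natural numbers.
    return y * math.log(lam) - lam - math.lgamma(y + 1)


def zero_inflated_poisson_lpmf(y, theta, lam):
    # theta is the mixing weight on the point mass (hypothetical name).
    return log_sum_exp(
        math.log(theta) + point_mass_at_zero_lpmf(y),
        math.log1p(-theta) + poisson_lpmf(y, lam),
    )
```

The mixture works here because the point-mass component returns -inf outside its support instead of raising, so for y > 0 only the Poisson term contributes.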

bob-carpenter avatar Feb 15 '24 19:02 bob-carpenter

Are there problems that arise in that setting we should consider?

If you pass a negative value to a lognormal_lpdf, I don't think you get back -Inf, but rather something like: Exception: lognormal_lpdf: Random variable is -1, but must be nonnegative!

So I think you'd either need to wrap the input lognormal inside a function that conditionally branches depending on whether you satisfy a support check (returning -Inf otherwise), or... ?

Edit: I think the example with the delta function on zero is an example of a function with support everywhere over the integers, and a return of -Inf everywhere except zero. Or am I misapprehending the way this all works?

Edit2: Maybe the exception is exactly the sort of runtime check you have in mind, and the solution is to force the user to define their own lpdf that implements the support check and branching logic.
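A rough Python sketch of what that user-defined wrapper might look like. The lognormal_lpdf below is a stand-in written for this example that mimics the exception quoted above; it is not Stan's implementation, and all names and the argument order are assumptions.

```python
import math


def log_sum_exp(a, b):
    """Stable log(exp(a) + exp(b)); either argument may be -inf."""
    m = max(a, b)
    if m == -math.inf:
        return -math.inf
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def normal_lpdf(y, mu, sigma):
    z = (y - mu) / sigma
    return -math.log(sigma) - 0.5 * math.log(2 * math.pi) - 0.5 * z * z


def lognormal_lpdf(y, mu, sigma):
    # Stand-in that raises on a negative argument, like the message above.
    if y < 0:
        raise ValueError(
            f"lognormal_lpdf: Random variable is {y}, but must be nonnegative!"
        )
    if y == 0:
        return -math.inf
    z = (math.log(y) - mu) / sigma
    return -math.log(y) - math.log(sigma) - 0.5 * math.log(2 * math.pi) - 0.5 * z * z


def safe_lognormal_lpdf(y, mu, sigma):
    # Explicit support check: branch before the call so nothing is raised.
    if y <= 0:
        return -math.inf
    return lognormal_lpdf(y, mu, sigma)


def normal_lognormal_mixture_lpdf(y, theta, mu1, sigma1, mu2, sigma2):
    # theta weights the normal component; 1 - theta weights the lognormal.
    return log_sum_exp(
        math.log(theta) + normal_lpdf(y, mu1, sigma1),
        math.log1p(-theta) + safe_lognormal_lpdf(y, mu2, sigma2),
    )
```

With the wrapper, a negative y simply drops the lognormal term, and the mixture reduces to the weighted normal component.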

jsocolar avatar Feb 15 '24 20:02 jsocolar

Good point about the exceptions. We'll have to warn people about that.

Edit 1: Yes, that's right. We were torn when first building Stan between throwing exceptions and returning -inf for out-of-support args and went with the former to make it easier for people to debug.

Edit 2: No, I just hadn't thought this through. But that is what it will force the user to do.

bob-carpenter avatar Feb 15 '24 21:02 bob-carpenter

the mixture_lpdf function could catch std::invalid_argument and return -infinity? That seems a bit too magical though, and definitely harder to debug
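To make the debugging concern concrete, here is a small Python sketch of the catch-and-return idea. ValueError stands in for std::invalid_argument, and lognormal_lpdf is a hypothetical stand-in that validates both its argument and its scale; none of this is Stan code.

```python
import math


def lognormal_lpdf(y, mu, sigma):
    # Stand-in with two run-time checks that raise the same exception type.
    if sigma <= 0:
        raise ValueError(
            f"lognormal_lpdf: Scale parameter is {sigma}, but must be positive!"
        )
    if y < 0:
        raise ValueError(
            f"lognormal_lpdf: Random variable is {y}, but must be nonnegative!"
        )
    if y == 0:
        return -math.inf
    z = (math.log(y) - mu) / sigma
    return -math.log(y) - math.log(sigma) - 0.5 * math.log(2 * math.pi) - 0.5 * z * z


def catching_component(lpdf, y, *params):
    # Treat any raised ValueError as "outside the support".
    try:
        return lpdf(y, *params)
    except ValueError:
        return -math.inf
```

In this sketch, catching_component(lognormal_lpdf, -1.0, 0.0, 1.0) gives -inf as intended, but so does catching_component(lognormal_lpdf, 1.0, 0.0, -1.0), where the real problem is a negative scale. The handler cannot tell the two apart, so the parameter error is silently swallowed.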

WardBrian avatar Feb 15 '24 22:02 WardBrian

the mixture_lpdf function could catch std::invalid_argument and return -infinity? That seems a bit too magical though, and definitely harder to debug

Do arguments of invalid type result in std::invalid_argument, or something else? If the former, then I think this is a no-go.

jsocolar avatar Feb 15 '24 22:02 jsocolar

Depends on what you mean by “type”. It would also be thrown, for example, if you give a matrix that isn’t a Cholesky factor to wishart_cholesky_lpdf.

WardBrian avatar Feb 15 '24 22:02 WardBrian

I mean, would catching exceptions in this way allow people to end up passing integers to _lpdfs or reals to _lpmfs without seeing an exception raised?

jsocolar avatar Feb 16 '24 00:02 jsocolar

That kind of thing would still be stopped by the compiler. std::invalid_argument is only used for runtime properties like bounds or requiring a matrix to be square.

WardBrian avatar Feb 16 '24 00:02 WardBrian

@jsocolar noted in an offline discussion that we need to include truncation contributions to the densities, too.
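One reading of "truncation contributions" is the usual renormalization: a density truncated below at L is the untruncated density divided by P(Y > L), so each truncated component would carry its own log-scale correction. A minimal Python sketch for a lower-truncated normal, with hypothetical function names and assuming that values below the bound get log density -inf:

```python
import math


def normal_lpdf(y, mu, sigma):
    z = (y - mu) / sigma
    return -math.log(sigma) - 0.5 * math.log(2 * math.pi) - 0.5 * z * z


def normal_lccdf(x, mu, sigma):
    """log P(Y > x) for Y ~ normal(mu, sigma), via the complementary error function."""
    return math.log(0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2))))


def lower_truncated_normal_lpdf(y, mu, sigma, lower):
    # Below the truncation point the density is zero (assumed convention).
    if y < lower:
        return -math.inf
    # Subtracting the log tail mass renormalizes the density over [lower, inf).
    return normal_lpdf(y, mu, sigma) - normal_lccdf(lower, mu, sigma)
```

If the correction term were left out, the truncated component would not integrate to one, and the mixing weights would no longer mean what they appear to mean.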

bob-carpenter avatar Apr 05 '24 13:04 bob-carpenter

We also need a note about mixing pdfs and pmfs being illegal. You could wrap a pmf in a user-defined pdf, but Stan's not going to be able to handle sampling parameters that mix discrete and continuous distributions.

bob-carpenter avatar Apr 05 '24 13:04 bob-carpenter

Some citations from @avehtari:

PyMC has pymc.Mixture: https://www.pymc.io/projects/docs/en/stable/api/distributions/generated/pymc.Mixture.html

Turing.jl has MixtureModel: https://turing.ml/dev/tutorials/01-gaussian-mixture-model/

bob-carpenter avatar Apr 05 '24 16:04 bob-carpenter