Question about log-likelihood for model with discrete variables

Open rafaol opened this issue 3 years ago • 2 comments

Hi,

I understand that I can use funsor.log_density() to compute the log-joint probability of a sample from a model whose discrete latent sites can be enumerated. However, I'm not sure what the best way is to compute the log-likelihood of the model with the discrete variables marginalised out.

If I simply call log_likelihood() with the enumerated model, I get the log-likelihood values for each possible realisation of the discrete variables, i.e., $p(y|\theta, c)$, where $y$ is the data, $\theta$ denotes samples of the continuous variables, and $c$ denotes a realisation of the discrete variables. However, log_likelihood() doesn't give me the prior probability of the discrete variables, $p(c)$, which I need in order to compute the marginal likelihood: $$p(y|\theta) = \sum_c p(y|\theta, c) p(c).$$

The only way I have found to make it work so far is to use numpyro.handlers.block to hide all variables except the observations $y$ and the discrete variables $c$, and then call funsor.log_density() on this blocked model, which yields the log-marginal likelihood $\log p(y|\theta)$. I was wondering if there's a better way to do it, maybe using log_likelihood() instead.
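To make the target quantity concrete, here is a minimal pure-Python sketch of the sum above, computed in log space with a log-sum-exp. The toy model (a Bernoulli $c$ with prior probability 0.3 and a unit-variance Normal likelihood centred at $\theta + c$) and the helper names `log_normal_pdf` and `log_marginal_likelihood` are made up for illustration and are not part of NumPyro.

```python
import math


def log_normal_pdf(y, mean, std=1.0):
    """Log density of Normal(mean, std) evaluated at y."""
    z = (y - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def log_marginal_likelihood(log_lik_given_c, log_prior_c):
    """log p(y|theta) = logsumexp over c of [log p(y|theta, c) + log p(c)]."""
    terms = [ll + lp for ll, lp in zip(log_lik_given_c, log_prior_c)]
    m = max(terms)  # subtract the max before exponentiating, for stability
    return m + math.log(sum(math.exp(t - m) for t in terms))


# Hypothetical toy model (an assumption for this sketch):
#   c ~ Bernoulli(0.3),  y | theta, c ~ Normal(theta + c, 1)
theta, y = 0.5, 1.2
log_prior_c = [math.log(0.7), math.log(0.3)]  # log p(c=0), log p(c=1)
log_lik_given_c = [log_normal_pdf(y, theta + c) for c in (0, 1)]

log_ml = log_marginal_likelihood(log_lik_given_c, log_prior_c)
print(log_ml)
```

The per-realisation values in `log_lik_given_c` play the role of what log_likelihood() returns for the enumerated model. The prior term `log_prior_c` is the piece that is missing there.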

rafaol avatar Jul 19 '22 12:07 rafaol

How about using log_density with the latent variables masked? Something like:

with handlers.mask(mask=False):
    # latent variables go here

Alternatively, log_density also returns a trace, so I guess you could extract the marginal log-likelihood from that trace.

fehiepsi avatar Jul 22 '22 10:07 fehiepsi

Thanks, I'll give it a go!

rafaol avatar Jul 22 '22 15:07 rafaol

Closed because it is better to ask questions on the forum.

fehiepsi avatar Aug 14 '22 10:08 fehiepsi