Chapter 10.1, notation of latent variables

Open davidmarttila opened this issue 3 months ago • 0 comments

For sections 10.1.3 -- 10.1.5, before amortized VI is introduced, should the latent variable $z$ be written as $z_n$, i.e. with the same index as the observation $x_n$? The observation $x_n$ has an actual, specific value. The latent variable $z_n$, however, only appears as the argument of a distribution or as a variable of integration.

Using eq. (10.29) as an example, dropping the index for $z$ would result in

$$\text{Ł}(\theta, \phi|\mathcal{D}) = \sum_{n=1}^N\left[\mathbb{E}_{q_\phi(z|x_n)} \left[\log p_\theta(x_n, z) - \log q_\phi(z|x_n)\right]\right]$$

which I think is still mathematically correct? To me, using an index where it's not required seems slightly misleading, since it implies a dependence on $n$ or $x_n$ even though there is none (it's $\phi_n$, not $z$, that depends on $x_n$). I think dropping the index is also more consistent with variational models estimating distributions over $z|x_n$, rather than specific values $z_n|x_n$.
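To spell out why I think the index is redundant: if the expectation is written as an integral (assuming a continuous latent), the latent is a bound variable, so renaming it should not change the value. Here $f$ is just shorthand I'm introducing for the integrand, $f(x_n, z) = \log p_\theta(x_n, z) - \log q_\phi(z|x_n)$, not notation from the book:

$$\mathbb{E}_{q_\phi(z_n|x_n)}\left[f(x_n, z_n)\right] = \int q_\phi(z_n|x_n)\, f(x_n, z_n)\, dz_n = \int q_\phi(z|x_n)\, f(x_n, z)\, dz = \mathbb{E}_{q_\phi(z|x_n)}\left[f(x_n, z)\right]$$

The only place $n$ enters each term is through $x_n$ (and through the per-datapoint variational parameters, if those are indexed).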

I think the same argument would apply to (10.19), (10.20), (10.27), (10.28), and (10.30).

davidmarttila, Sep 28 '25 18:09