deep-active-inference-mc

Calculation of Term 1 in G

Open nickypro opened this issue 1 year ago • 0 comments

I am having trouble understanding why, for term 1, the two computed entropy terms are added together:

https://github.com/zfountas/deep-active-inference-mc/blob/c40ef0dd9c16c4d98abfa1f9392d9777f1899dc9/src/tfmodel.py#L343-L344

I understand this gives the negated sum of two entropy terms (where `ps1` corresponds to $p_\tau$, the distribution of $s_\tau | \pi$, and `qs1` to $q_\tau$, the distribution of $s_\tau | o_\tau, \pi$):

$$- \sum_\tau ( \frac{1}{2} \log(2 e \pi \sigma^2_{p_\tau} ) + \frac{1}{2} \log (2 e \pi \sigma^2_{q_\tau} ) )
\xrightarrow{\text{simplify}} - \sum_\tau ( H_{p_\tau} + H_{q_\tau} ) $$

But in the paper, we see that term 1 is given by:

$$\sum_\tau E_{Q(\theta | \pi)}[E_{Q(o_\tau | \theta, \pi)}[H(s_\tau | o_\tau, \pi)] - H(s_\tau | \pi)]
\xrightarrow{\text{simplify}} + \sum_\tau ( H_{q_\tau} - H_{p_\tau} ) $$
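To make the discrepancy concrete, here is a minimal numerical sketch of the two readings for a single time step. It is not the repository's code: the variances `var_p` and `var_q` are made-up values, `gaussian_entropy` is a hypothetical helper, and I am assuming diagonal Gaussians so the entropy is a sum over dimensions.

```python
import math


def gaussian_entropy(variances):
    """Differential entropy of a diagonal Gaussian: sum_i 0.5 * log(2*pi*e*var_i)."""
    return sum(0.5 * math.log(2 * math.pi * math.e * v) for v in variances)


# Hypothetical per-dimension variances for one time step (not taken from the model).
var_p = [1.0, 0.5]   # stands in for ps1, i.e. s_tau | pi
var_q = [0.25, 0.1]  # stands in for qs1, i.e. s_tau | o_tau, pi

H_p = gaussian_entropy(var_p)
H_q = gaussian_entropy(var_q)

# My reading of the code: the negated sum of both entropies.
term1_as_read_from_code = -(H_p + H_q)

# My reading of the paper: H_q minus H_p.
term1_as_read_from_paper = H_q - H_p

# Difference between the two readings.
gap = term1_as_read_from_paper - term1_as_read_from_code

print("code reading: ", term1_as_read_from_code)
print("paper reading:", term1_as_read_from_paper)
print("gap:          ", gap)
```

Under these two readings the values differ by exactly $2 H_{q_\tau}$, so the whole question comes down to the sign in front of $H_{q_\tau}$.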

Why is there a discrepancy between the "+" and the "-"? Where is my understanding breaking down? Am I simplifying the equations incorrectly? If so, could you explain how to correctly transform one expression into the other?

nickypro, Nov 06 '23 19:11