
Hello, I have some questions about the loss function of the Gaussian distribution

SPOREIII opened this issue 4 years ago · 2 comments

The loss function of the Gaussian distribution given in the code is as follows:

```python
tf.reduce_mean(0.5*tf.math.log(sigma) + 0.5*tf.math.truediv(tf.math.square(y_true - y_pred), sigma)) + 1e-6 + 6
```

I think it can be expressed as the following mathematical formula:

$$\frac{1}{N}\sum\left(\frac{1}{2}\log(\sigma) + \frac{1}{2}\cdot\frac{(z-\mu)^2}{\sigma}\right) + 1\times 10^{-6} + 6$$

But starting from the formula in the original paper (DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks), I derived the following result:

$$-\frac{1}{N}\sum\log\!\left(\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(z-\mu)^2}{2\sigma^2}}\right) = \frac{1}{N}\sum\left(\log(\sqrt{2\pi}\,\sigma) + \frac{(z-\mu)^2}{2\sigma^2}\right) = \frac{1}{N}\sum\left(\log(\sqrt{2\pi}) + \log(\sigma) + \frac{(z-\mu)^2}{2\sigma^2}\right)$$

I don't know where I went wrong; I cannot recover the loss function given in the code.
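One way to see the discrepancy numerically: if the `sigma` variable in the code were interpreted as the *variance* σ² rather than the standard deviation, the code's expression would match the derived negative log-likelihood up to the constant ½ log(2π) (ignoring the stray `+ 1e-6 + 6` offset). The NumPy sketch below (variable names are illustrative, not from the repo) compares the two:

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.normal(size=1000)
mu = np.zeros(1000)
var = np.full(1000, 2.0)  # interpreting `sigma` in the repo code as the variance

# Loss as written in the repo (without the stray +1e-6+6 offset)
code_loss = np.mean(0.5 * np.log(var) + 0.5 * (y_true - mu) ** 2 / var)

# Full Gaussian negative log-likelihood, averaged over N
full_nll = np.mean(0.5 * np.log(2 * np.pi) + 0.5 * np.log(var)
                   + (y_true - mu) ** 2 / (2 * var))

# The two differ only by the constant (1/2) log(2*pi),
# which does not affect the gradient w.r.t. mu or var.
print(np.isclose(full_nll - code_loss, 0.5 * np.log(2 * np.pi)))  # True
```

Since the difference is a constant, it would not change the optimization; the question of whether `sigma` is meant as variance or standard deviation is the substantive one.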

SPOREIII avatar Jun 02 '20 10:06 SPOREIII

@SPOREIII I've changed this to your formula. Please check.

benman1 avatar Oct 04 '21 20:10 benman1

Thanks for the question. This is a very old repo, but @benman1 did a great job cleaning it up recently. Regarding the loss function: the one previously used was indeed wrong and needed to be corrected according to what you report above. As a further discussion topic, we might consider using the new probabilistic layers introduced with TensorFlow Probability, where a simple lambda function directly references the negative log-likelihood of a Gaussian distribution: `negloglik = lambda y, rv_y: -rv_y.log_prob(y)`. In that case we would not even need to express the likelihood explicitly.

arrigonialberto86 avatar Oct 04 '21 20:10 arrigonialberto86