
NLL results

Open pritamqu opened this issue 2 years ago • 5 comments

Hi - I was trying your code on Hawkes-1, similar to the https://github.com/shchur/ifl-tpp/blob/master/code/interactive.ipynb notebook. The NLL on the test set is 43.7, but the result reported in the paper is 0.52. Could you please clarify whether an additional step is needed to get the final result?

My apologies if this is a naive question, I am very new to the area of TPPs!

pritamqu avatar Oct 08 '22 00:10 pritamqu

Hi, no worries, here is the explanation: https://github.com/shchur/ifl-tpp#mistakes-in-the-old-version
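
To make the difference concrete, here is a minimal sketch (illustrative numbers and variable names, not the actual ifl-tpp API): the old code averaged each sequence's NLL divided by its own event count, while the fixed version divides the total NLL by a single constant shared across all sequences.

```python
import numpy as np

# Illustrative per-sequence total NLLs and event counts
# (made-up numbers, not produced by ifl-tpp).
seq_nll = np.array([120.0, 80.0, 200.0])  # total NLL of each sequence
num_events = np.array([40, 10, 100])      # number of events in each sequence

# Old (incorrect) normalization: each sequence's NLL is divided by a
# *different* number -- its own event count -- before averaging.
nll_old = np.mean(seq_nll / num_events)

# Fixed normalization: divide the total NLL by one constant that is the
# same for every sequence (here, the total event count of the split).
nll_fixed = seq_nll.sum() / num_events.sum()

print(nll_old, nll_fixed)  # the two numbers are generally not comparable
```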

shchur avatar Oct 08 '22 06:10 shchur

You can find the original code used for experiments in the paper here https://github.com/shchur/ifl-tpp/tree/original-code

shchur avatar Oct 08 '22 06:10 shchur


Hi, can you kindly explain a bit more why we cannot divide the NLL by the actual number of events in each sequence? In the paper you said this number should not depend on the sequence.

"In the old code we used to normalize the NLL of each sequence by the number of events --- this was incorrect. When computing NLL for multiple TPP sequences, we are only allowed to divide the NLL by the same number for each sequence."

Thanks in advance.

iLampard avatar Feb 05 '23 14:02 iLampard

As a simple example, consider a homogeneous Poisson process with rate $\lambda$ on an interval $[0, T]$. Suppose we have observed two sequences generated by this TPP - the first containing $N_1$ events and the second containing $N_2$ events - and want to estimate the parameter $\lambda$ using MLE.

Without normalization, the log-likelihood is $N_1\log \lambda - \lambda T + N_2\log \lambda - \lambda T = (N_1 + N_2) \log \lambda - 2 \lambda T$.

If we normalize the LL by $T$, we get $(N_1 / T) \log \lambda - \lambda + (N_2/T) \log \lambda - \lambda = ((N_1 + N_2) /T) \log \lambda - 2 \lambda$. This is proportional to the unnormalized log-likelihood, so we get the same MLE of $\lambda$.

If, however, we normalize the LL of each sequence by its number of events, we get a different LL function and end up with the wrong MLE estimate, as you can verify yourself: $\log \lambda - (T/N_1) \lambda + \log \lambda - (T/N_2) \lambda = 2 \log \lambda - (T/N_1 + T/N_2) \lambda$.

This small example demonstrates that normalizing by the number of events leads to incorrect estimation of the TPP parameters.
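
For anyone who wants to verify this numerically, here is a small sketch of the example above (the counts $N_1, N_2$ and horizon $T$ are made up, and a grid search stands in for the analytic maximizer):

```python
import numpy as np

# Two observed sequences on [0, T]; counts chosen to differ substantially.
N1, N2, T = 10, 40, 10.0

lam = np.linspace(0.1, 5.0, 5000)  # grid of candidate rates

# Unnormalized joint LL: (N1 + N2) log(lam) - 2 lam T.
ll_sum = (N1 + N2) * np.log(lam) - 2 * lam * T

# Normalizing every sequence's LL by the same constant T only rescales
# the objective, so the maximizer is unchanged.
ll_per_T = ll_sum / T

# Per-sequence normalization by event count:
# 2 log(lam) - (T/N1 + T/N2) lam.
ll_per_event = 2 * np.log(lam) - (T / N1 + T / N2) * lam

print(lam[np.argmax(ll_sum)])        # ~2.5 = (N1 + N2) / (2 T), the true MLE
print(lam[np.argmax(ll_per_T)])      # ~2.5, same maximizer
print(lam[np.argmax(ll_per_event)])  # ~1.6 = 2 N1 N2 / (T (N1 + N2)), biased
```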

shchur avatar Feb 20 '23 17:02 shchur


Thanks for your response. I get it.

iLampard avatar Mar 08 '23 06:03 iLampard