GRU-D
Trainable decay scheme issue
https://github.com/zhiyongc/GRU-D/blob/9f877f8730322a9b74a4df1bfb9d245cca968b61/GRUD.py#L120
Hi, my understanding of Equation 11 in the GRU-D paper is that the two x values on the right side of the equation are supposed to be different values. I believe mask * x is correct, since it multiplies x at the current time step. However, I believe delta_x * x is incorrect: that x should be the most recently observed value of x prior to the current time step t.
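To make the difference concrete, here is a minimal sketch in plain Python (no PyTorch), not the repository's actual code. The function names, the argument `x_last_obs`, and the sample numbers are assumptions made for illustration; `gamma_x` stands in for the decay factor that the thread refers to as delta_x, and the formula follows the reading of the equation described above.

```python
def impute_with_decay(x, mask, x_last_obs, gamma_x, x_mean):
    """Per-feature input imputation as described in the issue (sketch).

    x          : inputs at the current step (placeholder where missing)
    mask       : 1.0 where x is observed at this step, 0.0 where missing
    x_last_obs : most recently observed value of each feature before this step
    gamma_x    : decay factor per feature, in [0, 1]
    x_mean     : empirical mean of each feature
    """
    return [
        m * xi + (1 - m) * (g * xl + (1 - g) * mu)
        for xi, m, xl, g, mu in zip(x, mask, x_last_obs, gamma_x, x_mean)
    ]


def impute_as_reported(x, mask, gamma_x, x_mean):
    # Variant reported in the issue: the decay term reuses the current x
    # instead of the last observed value.
    return impute_with_decay(x, mask, x, gamma_x, x_mean)


# Hypothetical data: feature 0 is missing at this step, feature 1 is observed.
x = [0.0, 5.0]
mask = [0.0, 1.0]
x_last_obs = [2.0, 4.0]
gamma_x = [0.5, 0.5]
x_mean = [1.0, 1.0]

fixed = impute_with_decay(x, mask, x_last_obs, gamma_x, x_mean)
reported = impute_as_reported(x, mask, gamma_x, x_mean)
print(fixed)     # [1.5, 5.0]
print(reported)  # [0.5, 5.0]
```

The two versions agree wherever the value is observed and differ only at missing positions, where the reported variant decays the placeholder in x rather than the last real observation.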
This is worth fixing! Otherwise it's decaying the current imputed value.
@Thartvigsen I noticed this issue has not been fixed in the code yet.
@DeepWolf90 For now, update the equations in your local version so they are accurate. I'm sure the author is aware of the bug by now. Hopefully it'll be sorted out soon!
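A local fix needs the last observed value of each feature carried across time steps. The following is one possible way to track it, sketched in plain Python; the helper name `update_last_observed`, the initial value of 0.0, and the sample sequence are all assumptions, not taken from the repository.

```python
def update_last_observed(x_last_obs, x, mask):
    # Where the feature is observed (mask == 1) take the new value,
    # otherwise keep the previously stored observation.
    return [m * xi + (1 - m) * xl for xl, xi, m in zip(x_last_obs, x, mask)]


# Hypothetical single-feature sequence: observed, missing, observed.
xs = [[3.0], [0.0], [7.0]]
masks = [[1.0], [0.0], [1.0]]

last = [0.0]   # assumed initial value before any observation
history = []   # last observed value available *before* each step
for x_t, m_t in zip(xs, masks):
    history.append(list(last))
    last = update_last_observed(last, x_t, m_t)

print(history)  # [[0.0], [3.0], [3.0]]
print(last)     # [7.0]
```

At the missing step the stored value stays at 3.0, which is the value the decay term should use in place of the current x.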
Hi @CameronSCarlin, thanks for pointing out the flaw in the code. You are definitely correct. The issue is fixed in the updated version. Thanks also to @Thartvigsen and @DeepWolf90 for the reminders and suggestions!