6.3: Using LSTM layers instead of GRU layers gives NaN, why?
When I use LSTM instead of GRU, as the "Going even further" part suggests, the stacked-LSTM run gives NaN for both the training loss and val_loss.

Why do LSTM and GRU differ so much, and where does the NaN come from?

When I try the stacked LSTM on GPU, the loss is no longer NaN, but it becomes a very large number. Why do the GPU and CPU results differ so much?

When I switch from RMSprop to Adam on GPU, the loss progression is also strange,
for example 0.8** to 0.7**, then 5*****.*** to 4*****.***, and finally a very large number like 2***********.****.
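
For reference, this is presumably the model being described: the notebook's stacked-GRU setup with LSTM layers swapped in. The input shape `(None, 14)` is assumed from the Jena weather data used in the chapter; the rest follows the notebook.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Stacked recurrent model from section 6.3, with GRU replaced by LSTM.
# The 14 input features are assumed from the Jena temperature dataset.
model = keras.Sequential([
    keras.Input(shape=(None, 14)),
    layers.LSTM(32,
                dropout=0.1,
                recurrent_dropout=0.5,   # suspected culprit (see the last reply)
                return_sequences=True),
    layers.LSTM(64,
                dropout=0.1,
                recurrent_dropout=0.5),
    layers.Dense(1),
])
model.compile(optimizer=keras.optimizers.RMSprop(), loss="mae")
model.summary()
```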

Hi Jingmouren,
I ran into the same problem. Have you solved the "nan" result? Thanks.
Best, Haowen
recurrent_dropout causes this
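
A minimal sketch of the workaround implied by the reply above: remove `recurrent_dropout` (or set it to 0) on the LSTM layers. As a side benefit, LSTM/GRU layers without `recurrent_dropout` can typically fall back to the fast cuDNN kernels on GPU. The layer sizes mirror the notebook's stacked model; the input shape `(None, 14)` is again an assumption based on the Jena data.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Same stacked LSTM, but without recurrent_dropout.
model = keras.Sequential([
    keras.Input(shape=(None, 14)),
    layers.LSTM(32, dropout=0.1, return_sequences=True),  # no recurrent_dropout
    layers.LSTM(64, dropout=0.1),
    layers.Dense(1),
])
model.compile(optimizer=keras.optimizers.RMSprop(), loss="mae")
```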