
Learning rate for 1B corpus

Open · jhlau opened this issue 8 years ago · 0 comments

Hi, I am training on a Wikipedia corpus with 1B tokens, using sigmoid/gru hidden layers with a hidden layer count of 1/2/3. An initial learning rate of 0.01 gave me pretty good results when I was working with the 100M-token Wikipedia corpus, but on the 1B corpus, after a couple of epochs, both the sigmoid and gru models start producing NaN entropy. Just curious, what learning rate did you use for the 1B benchmark corpus? I am now setting it to 0.001 and hoping the gradients won't explode.
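For context, here is a hedged sketch of the kind of invocation being described. The flag names (-rnnlm, -train, -valid, -hidden, -hidden-type, -hidden-count, -alpha) follow the faster-rnnlm README, but the file names, hidden size, and layer count are illustrative assumptions, not the exact setup from this issue:

```sh
# Sketch of a faster-rnnlm training run with a lowered learning rate.
# File names and sizes are placeholders; only the flags come from the README.
./rnnlm -rnnlm model.bin \
        -train wiki.train.txt -valid wiki.valid.txt \
        -hidden 256 -hidden-type gru -hidden-count 2 \
        -alpha 0.001   # lowered from 0.01 to try to avoid NaN entropy
```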

jhlau · Apr 29 '16 16:04