Problem on reproducing results of RNN on PTB

Open tonystark940501 opened this issue 6 years ago • 2 comments

Hi @quark0, thanks for releasing the code. I really enjoyed your paper. I have a problem and hope you can help. I ran the search code with 5 different random seeds, including the default seed, and got 5 different RNN architectures. However, when I trained each of these 5 architectures from scratch, I got test perplexities (ppl) of 57.16, 61, 57.81, 60.99 and 57.53. None of them reaches a test ppl around 56. Is there anything I missed that is needed to robustly reproduce the 56.1 or 55.8 reported in the paper?

tonystark940501 avatar Dec 26 '18 05:12 tonystark940501
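
For reference, the spread across the five runs can be summarized with a few lines of Python. This is only a sketch built from the numbers quoted in the comment above; the variable names are illustrative, and the gap is measured against the 56.1 figure, which is an arbitrary choice between the two paper numbers mentioned.

```python
from statistics import mean, stdev

# Test perplexities reported above, one per search seed.
reported_ppl = [57.16, 61.0, 57.81, 60.99, 57.53]

# Paper figure used as the comparison point (assumption: 56.1, not 55.8).
paper_ppl = 56.1

avg = mean(reported_ppl)          # average over the five architectures
spread = stdev(reported_ppl)      # sample standard deviation
best = min(reported_ppl)          # lower perplexity is better
gap = best - paper_ppl            # distance of the best run from the paper

print(f"mean={avg:.3f} stdev={spread:.3f} best={best:.2f} gap={gap:.2f}")
```

Even the best of the five runs sits about one perplexity point above the paper figure, and the runs differ from each other by almost four points.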

You probably need to adjust the hyperparameters for the final evaluation. The default hyperparameters were tuned with respect to the provided genotype and are likely suboptimal for the new architectures.

quark0 avatar Dec 26 '18 07:12 quark0

@quark0 Thanks for your response! Do you have any suggestions for tuning the hyperparameters, such as which ones need to be tuned and over what ranges? I see there are a lot of hyperparameters to tune, including four dropout rates. Is it hard or expensive to do so?

tonystark940501 avatar Dec 27 '18 07:12 tonystark940501
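
One generic way to approach this kind of tuning is random search: draw a handful of configurations, train each architecture under each one, and keep the best by validation perplexity. The sketch below only generates the configurations. Every hyperparameter name and range in it is a hypothetical placeholder, not a value taken from the DARTS code or recommended by its authors, and would need to be mapped onto the actual training script's options.

```python
import math
import random

# Hypothetical search space: names and ranges are placeholders only.
UNIFORM_SPACE = {
    "dropout_output": (0.6, 0.8),
    "dropout_hidden": (0.1, 0.4),
    "dropout_input": (0.1, 0.3),
    "dropout_embedding": (0.0, 0.2),
}
# Parameters that vary over orders of magnitude are sampled in log space.
LOG_SPACE = {
    "weight_decay": (1e-7, 1e-5),
}


def sample_config(rng):
    """Draw one configuration from the placeholder search space."""
    config = {
        name: round(rng.uniform(lo, hi), 3)
        for name, (lo, hi) in UNIFORM_SPACE.items()
    }
    for name, (lo, hi) in LOG_SPACE.items():
        exponent = rng.uniform(math.log10(lo), math.log10(hi))
        config[name] = 10 ** exponent
    return config


def random_search(n_trials, seed=0):
    """Return n_trials configurations, reproducible for a given seed."""
    rng = random.Random(seed)
    return [sample_config(rng) for _ in range(n_trials)]


configs = random_search(n_trials=8, seed=0)
for i, cfg in enumerate(configs):
    print(i, cfg)
```

The cost scales with the number of trials times the cost of one full training run, so in practice it may be worth comparing configurations on shortened runs first and only training the most promising ones to completion.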