transformer-xl
Why was the perplexity (pplx) as high as 1000 when I was training on the wt103 dataset?
Could you give more information about your experiment setting? I did a quick run just now, and everything was OK.
I ran it with the parameters from the previous version, on a GPU.
The result is strongly affected by the hyper-parameters.
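For context on the number reported in this thread: assuming the logged perplexity is the exponential of the average per-token cross-entropy in nats (the usual definition, not confirmed here for this particular run), a perplexity of 1000 corresponds to a loss of roughly 6.9 nats per token. A minimal sketch of that relationship, with hypothetical helper names:

```python
import math

def perplexity(token_nlls):
    """Perplexity from per-token negative log-likelihoods (in nats).

    Assumes the standard definition: exp(mean NLL).
    """
    return math.exp(sum(token_nlls) / len(token_nlls))

def loss_for_perplexity(ppl):
    """Average per-token loss (in nats) that yields the given perplexity."""
    return math.log(ppl)

# Loss that corresponds to the perplexity mentioned in this thread.
print(f"{loss_for_perplexity(1000):.3f}")
```

Comparing the training loss against this value may help tell a run that is barely learning from one that is merely undertrained.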