joeshmoe0112358
To summarize my findings:
1. The numbers in the WikiText-2 column should be identical to the numbers in the WikiText-103 column because the val/test splits are identical between the two datasets. However,...
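For anyone who wants to sanity-check that the splits really do match, here is a minimal sketch, assuming the HuggingFace `datasets` package and the raw WikiText configs (none of this is from the repo), that compares the two sets of splits directly:

```python
# Hypothetical sanity check, not part of the repo: compare the WikiText-2 and
# WikiText-103 validation/test splits line-by-line via HuggingFace datasets.
from datasets import load_dataset

for split in ["validation", "test"]:
    wt2 = load_dataset("wikitext", "wikitext-2-raw-v1", split=split)
    wt103 = load_dataset("wikitext", "wikitext-103-raw-v1", split=split)
    identical = wt2["text"] == wt103["text"]
    print(f"{split}: identical={identical} ({len(wt2)} vs {len(wt103)} lines)")
```

If this prints `identical=True` for both splits, then any difference between the two columns has to come from the evaluation code rather than from the data.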
I am not actually sure why it's not working for you. I just pulled all changes, so I am on the latest version of the repo, and then I just...
I have just updated it and added the changes you suggested. Please let me know if it works for you now or if you are still running into problems...
I am looking into this now and I will update you or ask questions as needed. Thanks for the guidance.
Okay, after reading up on this carefully, here are my takeaways:
- I am seeing a lot of different perplexity numbers being reported across the board and many people having...
Okay, I think there are some contradictions caused by Alec's table and the accompanying information.

**Claims/Information:**
1. Alec claims to have produced a table of values for perplexity scores evaluated on the...
> i.e. 1.17 (in what i assume is the nll), so we're not even close to the right order of magnitude. The formatting of the table is broken, if you...
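For reference, since part of the confusion here seems to be NLL vs. perplexity: perplexity is just the exponential of the mean per-token negative log-likelihood (in nats), so converting between the two is a one-liner. A quick sketch using only the numbers already mentioned in this thread:

```python
import math

# Perplexity = exp(mean per-token NLL in nats); NLL = ln(perplexity).
def ppl_from_nll(nll_per_token: float) -> float:
    return math.exp(nll_per_token)

def nll_from_ppl(ppl: float) -> float:
    return math.log(ppl)

print(ppl_from_nll(1.17))   # ~3.22: even read as an NLL, 1.17 implies a tiny perplexity
print(nll_from_ppl(37.50))  # ~3.62: the NLL that a 37.50 perplexity corresponds to
```

So whichever way 1.17 is interpreted, it is nowhere near the 37.50 figure, which is consistent with the suspicion in the quoted comment that the table formatting is broken.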
Yes, I am highly skeptical of the reliability of Alec's table because 37.50 is exactly what the GPT-2 paper reports, and they used very different methods (in fact, the first...