joeshmoe0112358
To summarize my findings:
1. The numbers in the WikiText-2 column should be identical to the numbers in the WikiText-103 column because the val/test splits are identical between the two datasets. However,...
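For anyone who wants to sanity-check that the splits really do match, here is a minimal sketch, assuming the HuggingFace `datasets` package and the raw WikiText configs (none of this is from the repo), that compares the two sets of splits directly:

```python
# Hypothetical sanity check, not part of the repo: compare the WikiText-2 and
# WikiText-103 validation/test splits line-by-line via HuggingFace datasets.
from datasets import load_dataset

for split in ["validation", "test"]:
    wt2 = load_dataset("wikitext", "wikitext-2-raw-v1", split=split)
    wt103 = load_dataset("wikitext", "wikitext-103-raw-v1", split=split)
    identical = wt2["text"] == wt103["text"]
    print(f"{split}: identical={identical} ({len(wt2)} vs {len(wt103)} lines)")
```

If this prints `identical=True` for both splits, then any difference between the two columns has to come from the evaluation code rather than from the data.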
I am not actually sure why it's not working for you. I just pulled all changes, so I am on the latest version of the repo, and then I just...
I have just updated it and added the changes you suggested. Please let me know if it works for you now or if you are still running into problems...
I am looking into this now and I will update you or ask questions as needed. Thanks for the guidance.
Okay, after reading up on this carefully, here are my takeaways:
- I am seeing a lot of different perplexity numbers being reported across the board and many people having...
Okay, I think there are some contradictions caused by Alec's table and the accompanying information.

**Claims/Information:**
1. Alec claims to have produced a table of values for perplexity scores evaluated on the...
> i.e. 1.17 (in what i assume is the nll), so we're not even close to the right order of magnitude. The formatting of the table is broken, if you...
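For reference, since part of the confusion here seems to be NLL vs. perplexity: perplexity is just the exponential of the mean per-token negative log-likelihood (in nats), so converting between the two is a one-liner. A quick sketch using only the numbers already mentioned in this thread:

```python
import math

# Perplexity = exp(mean per-token NLL in nats); NLL = ln(perplexity).
def ppl_from_nll(nll_per_token: float) -> float:
    return math.exp(nll_per_token)

def nll_from_ppl(ppl: float) -> float:
    return math.log(ppl)

print(ppl_from_nll(1.17))   # ~3.22: even read as an NLL, 1.17 implies a tiny perplexity
print(nll_from_ppl(37.50))  # ~3.62: the NLL that a 37.50 perplexity corresponds to
```

So whichever way 1.17 is interpreted, it is nowhere near the 37.50 figure, which is consistent with the suspicion in the quoted comment that the table formatting is broken.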
Yes, I am highly skeptical of the reliability of Alec's table because 37.50 is exactly what the GPT-2 paper reports, and they used very different methods (in fact, the first...