1-billion-word-language-modeling-benchmark
If a word is not in the vocabulary, what should I do? Or can that never happen because of the FullTokenizer?
Please read https://github.com/ciprian-chelba/1-billion-word-language-modeling-benchmark/blob/master/README.perplexity_and_such and let me know if you still have questions.