awd-lstm-lm Dictionary - handling OOV tokens

Open chiphuyen opened this issue 6 years ago • 1 comment

I was looking into data.py and saw that the dictionary consists of all tokens in the train, val, and test files. Will adding unseen tokens from the val/test files to the dictionary affect testing in any way? Thanks!

Jul 19 '18 19:07 chiphuyen

Agreed. It could be okay for the benchmark dataset but seems problematic in a real scenario.

Aug 31 '18 09:08 gyuwankim
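
For anyone who wants to rule the concern out, here is a minimal sketch of the alternative: build the vocabulary from the training tokens only and map everything unseen to an unknown token. This is not the repository's data.py; the class name `TrainOnlyVocab`, the `<unk>` symbol, and the toy sentences are assumptions made up for illustration.

```python
from collections import Counter

UNK = "<unk>"  # assumed unknown-token symbol, not taken from the repo


class TrainOnlyVocab:
    """Hypothetical vocabulary built from training tokens only."""

    def __init__(self, train_tokens, min_count=1):
        counts = Counter(train_tokens)
        # Index 0 is reserved for UNK; the remaining words are sorted
        # so that the mapping is deterministic.
        kept = sorted(
            w for w, c in counts.items() if c >= min_count and w != UNK
        )
        self.idx2word = [UNK] + kept
        self.word2idx = {w: i for i, w in enumerate(self.idx2word)}

    def encode(self, tokens):
        # Tokens never seen in training fall back to the UNK index.
        unk_id = self.word2idx[UNK]
        return [self.word2idx.get(t, unk_id) for t in tokens]


# Toy data standing in for the train and test files.
train = "the cat sat on the mat".split()
test = "the dog sat on the rug".split()

vocab = TrainOnlyVocab(train)
test_ids = vocab.encode(test)

# Fraction of test tokens that were not in the training vocabulary.
unk_id = vocab.word2idx[UNK]
oov_rate = sum(i == unk_id for i in test_ids) / len(test_ids)
print(test_ids, oov_rate)
```

With this setup the embedding and softmax sizes depend only on the training data, and the OOV rate on val/test can be reported explicitly instead of being hidden by extra dictionary entries.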
