sequence_tagging
sequence_tagging copied to clipboard
Why use training, validation and test set to build vocabulary in build_data.py ? Isn't it information leak ?
I'm not sure but I think including test set to build vocabulary is spilling information from test set to training set. If using test set was OK, then why using only training set to build character set in build_data.py ?