keras-preprocessing
How does keras.preprocessing.text.Tokenizer handle oov_token and predefined special tokens?
I am trying to use Tokenizer to handle string input. I pass "<UNK>" as the "oov_token" param when initializing the Tokenizer. However, the index assigned to oov_token is greater than num_words, so it cannot be used directly as a token index in embedding_lookup, because it falls outside an embedding table with num_words rows. My second question is how to use predefined special tokens with Tokenizer, such as <GO>, <EOS>, and <PAD>.
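To make the goal concrete, here is a minimal sketch of the index layout I am after, written in plain Python rather than with Tokenizer. The helper names `build_word_index` and `encode`, the ordering of the special tokens, and whitespace splitting are all my own assumptions for illustration, not part of the Keras API. The special tokens get the lowest indices, so every index, including the one for <UNK>, stays below num_words.

```python
from collections import Counter

# Assumed layout: special tokens take the lowest indices, with <PAD> at 0.
SPECIAL_TOKENS = ["<PAD>", "<UNK>", "<GO>", "<EOS>"]


def build_word_index(texts, num_words):
    """Hypothetical helper: map tokens to indices in [0, num_words)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    word_index = {token: i for i, token in enumerate(SPECIAL_TOKENS)}
    # Only the most frequent words that still fit under num_words get an index.
    room = max(num_words - len(SPECIAL_TOKENS), 0)
    for word, _ in counts.most_common(room):
        if word not in word_index:
            word_index[word] = len(word_index)
    return word_index


def encode(text, word_index):
    """Hypothetical helper: wrap a sentence in <GO>/<EOS>, map unknowns to <UNK>."""
    unk = word_index["<UNK>"]
    ids = [word_index.get(word, unk) for word in text.lower().split()]
    return [word_index["<GO>"]] + ids + [word_index["<EOS>"]]


texts = ["the cat sat on the mat", "the dog sat"]
word_index = build_word_index(texts, num_words=7)
encoded = encode("the cat flew", word_index)  # "flew" is out of vocabulary
```

With this layout the encoded ids can index an embedding table of shape (num_words, dim) directly. Is there a supported way to get the same result from Tokenizer itself?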