PLBART
Support for FastTokenizer in huggingface
Hello, I found that there is no corresponding PLBartTokenizerFast in Hugging Face Transformers. Do you have plans to implement a fast version of the tokenizer?
In fact, I need to call the word_ids() method of a fast tokenizer to get, for each tokenized token, the index of the original word it came from:
word_ids = tokenized_inputs.word_ids(batch_index=i)
Alternatively, is there another way to compute the original word index corresponding to each tokenized token?
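As a possible workaround until a fast tokenizer exists, one can tokenize each word separately with the slow tokenizer and count the resulting subword tokens. This is only a sketch: the helper name is hypothetical, and per-word tokenization can differ slightly from tokenizing the whole sentence at once (e.g. SentencePiece treats a leading space differently), so the result approximates word_ids() rather than reproducing it exactly.

```python
from typing import Callable, List

def word_ids_from_slow(words: List[str],
                       tokenize: Callable[[str], List[str]]) -> List[int]:
    """Tokenize each word on its own and record, for every resulting
    subword token, the index of the word it came from. Mimics the fast
    tokenizer's word_ids(), ignoring special tokens."""
    ids: List[int] = []
    for i, word in enumerate(words):
        ids.extend([i] * len(tokenize(word)))
    return ids

# Stand-in tokenizer for illustration only: splits every 3 characters.
# With transformers you would instead pass the slow tokenizer's
# `tokenizer.tokenize` method (an assumption, untested with PLBART).
toy_tokenize = lambda w: [w[j:j + 3] for j in range(0, len(w), 3)]

print(word_ids_from_slow(["hello", "hi"], toy_tokenize))  # [0, 0, 1]
```

With a real slow PLBartTokenizer, calling `word_ids_from_slow(sentence.split(), tokenizer.tokenize)` would give an approximate word index per token, minus any special tokens the model adds.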
Thank you very much!