spaCy
Sentencepiece-based Language
feature request:
Sentencepiece is the tokenizer used in XLNet.
I think if `Language` tokenized text with sentencepiece directly, the alignment process between spaCy tokens and sentencepiece pieces could be skipped, which would make the model more efficient.
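To illustrate the alignment step I mean, here is a minimal sketch in plain Python. The `align` helper and the piece list are hypothetical and written by hand for illustration; they are not real sentencepiece output or spaCy/camphr API. It only shows the kind of bookkeeping needed when the `Language` tokens and the sentencepiece pieces differ.

```python
from typing import List


def align(tokens: List[str], pieces: List[str]) -> List[List[int]]:
    """Map each word-level token to the indices of the pieces that spell it.

    Hypothetical helper: assumes the pieces, with the "▁" word-boundary
    marker removed, concatenate exactly to the tokens in order.
    """
    alignment = []
    i = 0
    for token in tokens:
        span = []
        spelled = ""
        while spelled != token:
            if i >= len(pieces) or len(spelled) > len(token):
                raise ValueError(f"cannot align token {token!r}")
            # Drop the word-boundary marker before comparing.
            spelled += pieces[i].replace("\u2581", "")
            span.append(i)
            i += 1
        alignment.append(span)
    return alignment


# Hand-written example data (assumption, not produced by a real model).
tokens = ["XLNet", "uses", "sentencepiece"]
pieces = ["\u2581XL", "Net", "\u2581uses", "\u2581sentence", "piece"]
ALIGNMENT = align(tokens, pieces)
print(ALIGNMENT)
```

If `Language` produced the pieces as its tokens in the first place, this mapping would be the identity and the extra step would not be needed.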
I added this functionality in camphr.
Documentation: https://camphr.readthedocs.io/en/latest/notes/sentencepiece.html