YouTokenToMe
Support custom tokens
This PR resolves #65 and #44 by implementing the custom tokens feature.
BPE training is unchanged; custom tokens are simply added to the model file. They are used only during the encode/decode phase.
Encoding speed is not affected if custom tokens are not provided. Providing custom tokens will make encoding time about 10% longer, which should be acceptable.
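
To illustrate the idea for reviewers, here is a minimal Python sketch of how custom tokens can be applied at encode time without touching BPE training. It is not the code in this PR: `encode_with_custom_tokens`, `custom_token_ids`, `base_encode`, and `toy_encode` are hypothetical names, and the regex-based splitting is an assumed strategy chosen for brevity. The sketch also shows why encoding without custom tokens can stay on the original fast path.

```python
import re
from typing import Callable, Dict, List


def encode_with_custom_tokens(
    text: str,
    custom_token_ids: Dict[str, int],
    base_encode: Callable[[str], List[int]],
) -> List[int]:
    """Encode text, mapping each custom token to its reserved id.

    Hypothetical helper: `base_encode` stands in for the unchanged BPE encoder.
    """
    if not custom_token_ids:
        # Fast path: no custom tokens, so the base encoder runs unchanged.
        return base_encode(text)

    # Try longer tokens first so overlapping tokens prefer the longest match.
    ordered = sorted(custom_token_ids, key=len, reverse=True)
    pattern = re.compile("(" + "|".join(re.escape(t) for t in ordered) + ")")

    ids: List[int] = []
    # re.split with a capturing group keeps the matched custom tokens
    # in the result, interleaved with the ordinary text between them.
    for piece in pattern.split(text):
        if not piece:
            continue  # empty strings appear when a token starts or ends the text
        if piece in custom_token_ids:
            ids.append(custom_token_ids[piece])
        else:
            ids.extend(base_encode(piece))
    return ids


def toy_encode(text: str) -> List[int]:
    """Toy stand-in for a BPE encoder: one id per character."""
    return [ord(c) for c in text]


print(encode_with_custom_tokens("a<sep>b", {"<sep>": 1000}, toy_encode))
# prints [97, 1000, 98]
```

In this sketch the extra cost comes only from scanning the text for custom tokens, which is consistent with a modest slowdown when custom tokens are provided and none otherwise.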