Question about simple_tokenizer.

Open Mypathissional opened this issue 2 years ago • 0 comments

Hey, I have noticed that the code for tokenization (simple_tokenizer.py) is very similar to the GPT-2 encoding, except that the vocabulary contains tokens ending with "</w>". What is the meaning of "</w>"?

I would be grateful for any help.

Mypathissional, Feb 17 '23 11:02