CLIP
Question: Do the input tokens have to come from clip.tokenize(str) when using the pretrained model?
Can I use a different method to tokenize the input prompt and still get a proper prediction, or must I use the clip.tokenize(str) method? I'm wondering if I can, for example, use Hugging Face's BERT tokenizer or SentencePiece.
My intuition says that I must use clip.tokenize(str), since that produces the vocabulary and token ids the model was trained with.
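For reference, this is the standard path I mean, a minimal sketch using the openai/clip package (the model name "ViT-B/32" and the example prompts are just for illustration):

```python
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# clip.tokenize uses CLIP's own BPE vocabulary and returns a
# (batch_size, 77) tensor of token ids, padded/truncated to the
# context length the model was trained with.
tokens = clip.tokenize(["a photo of a dog", "a photo of a cat"]).to(device)

with torch.no_grad():
    text_features = model.encode_text(tokens)

print(text_features.shape)  # e.g. torch.Size([2, 512]) for ViT-B/32
```

What I'd like to know is whether I can swap the clip.tokenize call above for a different tokenizer and still feed the resulting ids to model.encode_text.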