CLIP
How to get all the features from the image encoder and text encoder?
I want to extract more features from the pre-trained CLIP model than just the 512-dimensional CLS token embedding. How can I modify the model to do that?