flex-dm
Embedding dimensions
Hello, I'm a bit confused: the paper and the supplementary material state that image and text features are extracted with CLIP as 768-dimensional vectors, but in the code the embeddings have a 512-dimensional shape. Is there something I'm missing, or are you downscaling from 768 to 512 dimensions somewhere?