How can I get the clip-vit-large-patch14-448
Hello, your project is interesting. However, the link you gave in the README is for clip-vit-large-patch14-224, and I can't find clip-vit-large-patch14-448 on Hugging Face. Could you update the link for clip-vit-large-patch14-448?
Thanks for your interest. There is no original clip-vit-large-patch14-448 on the Hugging Face Hub. We applied positional embedding interpolation to adapt the original 224-resolution clip-vit to support an input resolution of 448.
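For anyone else landing here: a minimal sketch of what that interpolation typically looks like, assuming the standard CLIP ViT layout (a learned class token followed by a square grid of patch position embeddings; with patch size 14, that is 224/14 = 16 patches per side at 224 and 448/14 = 32 per side at 448). The function name and shapes are my own illustration, not necessarily the repo's exact code:

```python
import torch
import torch.nn.functional as F

def interpolate_pos_embed(pos_embed: torch.Tensor, new_grid: int) -> torch.Tensor:
    """Bicubically interpolate a ViT positional-embedding table.

    pos_embed: (1, 1 + old_grid**2, dim), class-token embedding first.
    new_grid:  patches per side at the new resolution (448 / 14 = 32).
    """
    cls_tok, patch_pos = pos_embed[:, :1], pos_embed[:, 1:]
    dim = pos_embed.shape[-1]
    old_grid = int(patch_pos.shape[1] ** 0.5)
    # (1, N, D) -> (1, D, H, W) so we can interpolate spatially
    patch_pos = patch_pos.reshape(1, old_grid, old_grid, dim).permute(0, 3, 1, 2)
    patch_pos = F.interpolate(patch_pos, size=(new_grid, new_grid),
                              mode="bicubic", align_corners=False)
    # back to (1, new_grid**2, D) and re-attach the class token unchanged
    patch_pos = patch_pos.permute(0, 2, 3, 1).reshape(1, new_grid * new_grid, dim)
    return torch.cat([cls_tok, patch_pos], dim=1)

# Hypothetical example: ViT-L/14 has embedding dim 1024.
old = torch.randn(1, 1 + 16 * 16, 1024)   # 224x224 -> 257 positions
new = interpolate_pos_embed(old, new_grid=32)
print(new.shape)  # torch.Size([1, 1025, 1024]) for 448x448
```

After replacing the model's position-embedding parameter with the interpolated table (and updating the image size in the preprocessing config), the 224 checkpoint can accept 448 inputs; fine-tuning afterwards lets the embeddings adapt to the new resolution.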
Thank you very much for the information. One question: do we need to implement the positional embedding interpolation ourselves to adapt the original clip-vit model, which supports 224 input, to an input resolution of 448? Thank you for your response!
The paper mentions that all modules are trained. Does that include the CLIP model? If so, could you please provide the fine-tuned CLIP weights? Otherwise, it is difficult to reproduce the results.
Thanks in advance!