CoOp
About input of text
Thanks for your great work!
I want to ask why the forward function does not take (image, text) as input, such as output = self.model(image, text).
Also, what is the scheme for matching the text logits and the image logits?
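
To make the second question concrete, here is a minimal sketch of the CLIP-style matching I have in mind. It is only my assumption and may not be what CoOp actually does: both sets of features are L2-normalized, and the logits are scaled cosine similarities between each image and each class text feature. The names `l2_normalize`, `match_logits`, and `logit_scale` (and its value of 100) are hypothetical and not taken from the repository.

```python
import math

def l2_normalize(vec):
    # Scale a vector to unit length so a dot product equals cosine similarity.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def match_logits(image_features, text_features, logit_scale=100.0):
    # image_features: one feature vector per image.
    # text_features: one feature vector per class prompt.
    # Returns a [num_images][num_classes] matrix of scaled cosine similarities.
    image_features = [l2_normalize(v) for v in image_features]
    text_features = [l2_normalize(v) for v in text_features]
    return [
        [logit_scale * sum(a * b for a, b in zip(img, txt)) for txt in text_features]
        for img in image_features
    ]

# Toy 2-D features: two images, three class prompts (made-up numbers).
images = [[1.0, 0.0], [0.0, 2.0]]
texts = [[3.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
logits = match_logits(images, texts)

# The predicted class for each image is the prompt with the highest logit.
predictions = [row.index(max(row)) for row in logits]
```

If this picture is right, the text features depend only on the class prompts and not on the image, which might be why text is not passed to forward, but I would appreciate confirmation.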