Giovanni Puccetti

108 comments by Giovanni Puccetti

@rwightman @vturrisi this was the intention, because the embedding that makes sense to use for contrastive downstream tasks with coca is the one output by the pooler. The only detail...

@rwightman Ah, now I see. I made a mistake reading the paper; I thought it worked the way I wrote it.

@rwightman OK, indeed I was thinking about that. I believe that two poolers and one pooler with one extra query are equivalent, except for the shared linear layer inside MultiHeadAttention

> I don't see how they'd be equivalent with the softmax there...

@rwightman Maybe I am just in denial; however, each row of the attention matrix is one query's dot product...
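For reference, a small numerical sketch of the point above: the softmax in attention is applied per row, i.e. per query, so the output for an existing pooler query does not change when an extra query is added to the same pooler. The names (`attend`, `q1`, `q_extra`) and shapes are illustrative, and the shared output projection inside `MultiHeadAttention` (the one difference mentioned above) is deliberately left out.

```python
import torch

d = 8
keys = torch.randn(10, d)
values = torch.randn(10, d)

q1 = torch.randn(1, d)       # original pooler query
q_extra = torch.randn(1, d)  # extra query added to the same pooler


def attend(q):
    # softmax over the key dimension, computed independently for each query row
    w = torch.softmax(q @ keys.T / d ** 0.5, dim=-1)
    return w @ values


out_alone = attend(q1)
out_joint = attend(torch.cat([q1, q_extra]))[:1]  # same pooler with two queries
print(torch.allclose(out_alone, out_joint, atol=1e-6))  # True
```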

@vturrisi I am planning a PR that should improve the huggingface integration along with a few other changes; I will add this in that PR as soon as I start working on...

Hi @John-Kieron, is this on the latest version?

@John-Kieron I can't manage to replicate this. Can you share some more info? Did you make any changes to the code?

Hi @vedantroy, for the different special tokens I don't know if there is a specific reason; as for the exclamation marks, the reason they are there is that the tokenizer uses...
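For context, a minimal check of the tokenizer behaviour referenced here, assuming open_clip's CLIP BPE tokenizer (import paths and method names may differ slightly between versions): sequences are zero-padded to the context length, and id 0 in that vocabulary is the token `!`, so decoding a padded sequence shows a run of trailing exclamation marks.

```python
# Sketch only: assumes open_clip.tokenizer exposes SimpleTokenizer and tokenize.
from open_clip.tokenizer import SimpleTokenizer, tokenize

tok = SimpleTokenizer()
ids = tokenize(["a photo of a cat"])[0]  # length-77 LongTensor, zero-padded
# id 0 in the CLIP BPE vocab decodes to "!", hence the exclamation marks
print(tok.decode(ids.tolist()))
```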

Sure, I will try to reuse as much as I can. For now it is mostly copied from the coca-pytorch repo; I will probably ask for some help while I move...

@rom1504 I will reuse the visual_model from open_clip; however, in coca-pytorch the transformer layers for the text model are different from the regular ones, feed_forward and attention are parallel, do...
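To make the difference concrete, here is a minimal sketch of a "parallel" residual block in the style of coca-pytorch, where attention and the feed-forward read the same normalised input and their outputs are summed into a single residual, as opposed to the regular sequential blocks in open_clip's text transformer. The class and layer names below are illustrative, not the actual coca-pytorch identifiers.

```python
import torch
import torch.nn as nn


class ParallelBlock(nn.Module):
    """Attention and feed-forward applied in parallel to the same input."""

    def __init__(self, dim: int, n_heads: int, mlp_ratio: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(dim, dim * mlp_ratio),
            nn.GELU(),
            nn.Linear(dim * mlp_ratio, dim),
        )

    def forward(self, x):
        h = self.norm(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        # both branches see the same normed input; one residual sums them
        return x + attn_out + self.ff(h)


x = torch.randn(2, 16, 512)
print(ParallelBlock(512, 8)(x).shape)  # torch.Size([2, 16, 512])
```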