InfoLOOB Loss

Open raunak-agarwal opened this issue 3 years ago • 5 comments

Hi, Thanks for the great work!

Do you think InfoLOOB (formulation here, implementation here) would be a good addition to this library? It seems to outperform InfoNCE in an image-text setting, so I thought it might be worth experimenting with on purely text-based IR tasks.
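For anyone who has not seen the two objectives side by side, here is a minimal sketch of the difference. It is not the CLOOB authors' implementation and not Tevatron code; the names `info_nce`, `info_loob`, `sim`, and `tau` are made up for illustration, and it covers one direction only (query to passage), leaving out the reverse direction and the Hopfield retrieval step that CLOOB also uses. The only change is that InfoLOOB leaves the positive pair out of the denominator.

```python
import math


def _logsumexp(values):
    # Numerically stable log(sum(exp(v))) for a non-empty list of floats.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def info_nce(sim, tau=1.0):
    """InfoNCE over a square similarity matrix.

    sim[i][j] is the similarity of query i and passage j (hypothetical
    layout); the positive for query i sits on the diagonal.
    """
    n = len(sim)
    total = 0.0
    for i in range(n):
        logits = [s / tau for s in sim[i]]
        # Denominator includes the positive and all in-batch negatives.
        total += _logsumexp(logits) - logits[i]
    return total / n


def info_loob(sim, tau=1.0):
    """InfoLOOB ("leave one out bound") over the same matrix.

    Identical to info_nce except the positive is excluded from the
    denominator, so at least one negative per row is required.
    """
    n = len(sim)
    if n < 2:
        raise ValueError("InfoLOOB needs at least one negative per query")
    total = 0.0
    for i in range(n):
        logits = [s / tau for s in sim[i]]
        # Denominator is built from the negatives only (j != i).
        negatives = [l for j, l in enumerate(logits) if j != i]
        total += _logsumexp(negatives) - logits[i]
    return total / n


# Toy 2x2 batch: positives on the diagonal score 1.0, negatives 0.0.
sim = [[1.0, 0.0], [0.0, 1.0]]
nce_value = info_nce(sim)
loob_value = info_loob(sim)
```

Because the positive is dropped from the denominator, the InfoLOOB value is never larger than the InfoNCE value on the same batch and can go below zero, whereas InfoNCE is bounded below by zero. In a real training setup this would presumably be written against batched tensors rather than Python lists.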

raunak-agarwal avatar May 12 '22 03:05 raunak-agarwal

Thanks for your suggestion! This looks interesting to me! I think maybe it is time for me and my colleagues to think about adding additional forms of loss to Tevatron.

In terms of development, I think we will survey a collection of interesting losses and add them together in a single PR. We are open to suggestions of other loss functions to include.

luyug avatar May 24 '22 17:05 luyug

AFAIK, the latent space of CLOOB seems to align the text and image modalities much better than CLIP's. Below are two plots I saw someone post on EleutherAI's Discord, where they created UMAPs from a small sample of image-text pairs (CLIP on top and CLOOB below).

[Image: UMAP plots, CLIP on top and CLOOB below]

Let me know if integrating this is in the works. It would be a great addition to the library. I can also ping here if I come across other interesting losses.

raunak-agarwal avatar May 25 '22 08:05 raunak-agarwal

One question: do you have any expectations for what this loss will do in a text-only setup?

luyug avatar May 25 '22 19:05 luyug

My expectation is that in the case of two-tower setups, we might see better-aligned embeddings. (I don't think this approach is meant for single-tower setups.)

Other than that, it's hard to say beforehand how much of an improvement we can expect.

raunak-agarwal avatar May 26 '22 13:05 raunak-agarwal

I see. We will triage this through the weekend.

luyug avatar May 26 '22 13:05 luyug