open_clip

New feature: text/text contrastive

Open rom1504 opened this issue 3 years ago • 3 comments

Can we adapt openclip to be able to train text/text contrastive models?

And beyond that, maybe text/text/image models?

use cases:

  • train pure contrastive text models, either on multilingual pairs or on monolingual pairs
  • use similarity between text pairs as well as between text and images, to get better text understanding while keeping good text/image understanding

options:

  • single tower for text1 and text2
  • two towers
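
To make the two options concrete, here is a minimal sketch of a symmetric contrastive (InfoNCE-style) loss over text/text pairs, in plain Python. It is not open_clip code: the function names and the temperature value of 0.07 are assumptions chosen for illustration. The loss is the same for both options. With a single tower, `emb_a` and `emb_b` come from one shared encoder; with two towers, each side has its own encoder.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_z = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def text_text_contrastive_loss(emb_a, emb_b, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired text embeddings.

    emb_a[i] and emb_b[i] are embeddings of the two texts in pair i.
    All other combinations in the batch are treated as negatives.
    The temperature default is an assumption, not a tuned value.
    """
    a = [l2_normalize(v) for v in emb_a]
    b = [l2_normalize(v) for v in emb_b]
    # Cosine similarity of every text in a against every text in b.
    logits_ab = [
        [sum(x * y for x, y in zip(u, w)) / temperature for w in b]
        for u in a
    ]
    # Transpose to score every text in b against every text in a.
    logits_ba = [list(col) for col in zip(*logits_ab)]
    return 0.5 * (
        cross_entropy_diagonal(logits_ab) + cross_entropy_diagonal(logits_ba)
    )


# Toy batch of two pairs: matched pairs should score a lower loss
# than the same embeddings with the pairing swapped.
texts_a = [[1.0, 0.0], [0.0, 1.0]]
aligned_b = [[1.0, 0.0], [0.0, 1.0]]
swapped_b = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = text_text_contrastive_loss(texts_a, aligned_b)
loss_swapped = text_text_contrastive_loss(texts_a, swapped_b)
```

A text/text/image variant could plausibly sum this loss with the usual image/text loss, but how to weight the two terms is an open question here.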

It would be nice to find a way to do this without making the code overly complicated.

This goes in the direction of supporting more modality combinations in openclip.

One motivation is that there are surprisingly few good text/text models, even though the community working on this is quite active.

A related idea is image/image, inspired by https://arxiv.org/abs/2212.08045

private models to beat, for reference:

  • https://twitter.com/Nils_Reimers/status/1602355249297358849
  • https://twitter.com/OpenAI/status/1603466863370854401

rom1504 avatar Dec 19 '22 20:12 rom1504

Hi @rom1504, I am very interested in this, particularly text/text/image using a single tower for text. If you have started anything I would love to see it, otherwise I will probably try and get something going in the next couple of weeks.

jn2clark avatar Jan 26 '23 22:01 jn2clark

Check #323

rom1504 avatar Jan 26 '23 22:01 rom1504

Amazing! Thank you!

jn2clark avatar Jan 27 '23 00:01 jn2clark