CLIP
Are there public datasets in your WIT dataset?
Does the WIT dataset contain images or text from public datasets such as COCO, DIOR, BraTS, or DOTA?
Without this information, any work that uses CLIP is open to the suspicion of data leakage, because the model was trained on an unspecified dataset. This hinders research based on CLIP.
The abstract of the CLIP paper says:
a dataset of 400 million (image, text) pairs collected from the internet
The COCO paper says:
we collected images from Flickr
and Section 3.2 reads as though they also used Google and Bing image search.
So yes, there might be data contamination: both datasets draw on images from the public web. Whether that actually matters depends on the problem being solved.