What's the difference in data between `Pretraining` and `Vision & Language Tasks`?

Open flymark2010 opened this issue 2 years ago • 1 comments

The dataset file describes the datasets used for Pretraining and for Vision & Language Tasks. I found that some of them overlap, such as RefCOCO and VQAv2. What is the difference between the overlapping ones?

flymark2010, Jul 26 '22 03:07

@flymark2010 We filter our pretraining data and exclude images that appear in the validation and test sets of downstream tasks to avoid data leakage. As a result, the portion of each downstream dataset (e.g., RefCOCO, VQAv2) used in pretraining is smaller than the original downstream dataset.

logicwong, Jul 27 '22 07:07
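
For readers who want to see what this kind of filtering might look like, here is a minimal sketch. It is not the OFA authors' actual pipeline: the function names, the `image_id` field, the split names (`train`, `val`, `test`), and the toy data are all assumptions made for illustration. The idea is to collect the image ids of every validation and test sample, then drop any pretraining candidate whose image is in that set.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical names for the held-out splits; real datasets may differ.
HELD_OUT_SPLITS = ("val", "test")


def collect_held_out_ids(splits: Dict[str, Iterable[dict]]) -> Set[str]:
    """Gather the image ids of all validation/test samples of a downstream task."""
    held_out: Set[str] = set()
    for split_name, samples in splits.items():
        if split_name in HELD_OUT_SPLITS:
            held_out.update(sample["image_id"] for sample in samples)
    return held_out


def filter_pretraining(samples: Iterable[dict], held_out: Set[str]) -> List[dict]:
    """Keep only the samples whose image is not in any held-out split."""
    return [sample for sample in samples if sample["image_id"] not in held_out]


# Toy downstream task (assumed structure: one list of samples per split).
downstream = {
    "train": [{"image_id": "img1"}, {"image_id": "img2"}],
    "val": [{"image_id": "img3"}],
    "test": [{"image_id": "img4"}],
}

# Toy pretraining candidates, some of which reuse downstream images.
candidates = [{"image_id": f"img{i}"} for i in range(1, 6)]

held_out = collect_held_out_ids(downstream)
filtered = filter_pretraining(candidates, held_out)
print([sample["image_id"] for sample in filtered])
```

Filtering by image id rather than by annotation matters here, because the same image can carry different annotations in different tasks and would otherwise still leak into pretraining.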