Fashion_Captioning

Dataset Details Mismatch

Open gourango01 opened this issue 3 years ago • 0 comments

Is the dataset used in the paper different from the preprocessed dataset provided on Google Drive, or am I missing something?
Preprocessed data from Google Drive: TRAIN: 888293, VAL: 19915, TEST: 101225

From the paper, Section 5.1: "It contains 993K images and 130K descriptions, and we split the whole dataset, with approximately 794K image-description pairs for training, 99K for validation, and the remaining 100K for test."
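To make the mismatch concrete, here is a small sketch that sums both sets of split sizes and compares their proportions. It uses only the numbers quoted above; the paper's figures are approximate (rounded to thousands), and the names `drive`, `paper`, and `summarize` are just illustrative.

```python
# Split sizes as quoted in this issue.
# The paper's figures are approximate ("794K", "99K", "100K").
drive = {"train": 888_293, "val": 19_915, "test": 101_225}
paper = {"train": 794_000, "val": 99_000, "test": 100_000}


def summarize(name, split):
    """Print and return the total size and each split's share of that total."""
    total = sum(split.values())
    shares = {k: round(v / total, 3) for k, v in split.items()}
    print(f"{name}: total={total:,} shares={shares}")
    return total, shares


drive_total, drive_shares = summarize("drive", drive)
paper_total, paper_shares = summarize("paper", paper)
print("difference in totals:", drive_total - paper_total)
```

The totals are fairly close (about 1.009M on Google Drive versus 993K in the paper), but the proportions are not: validation is roughly 2% of the Google Drive data versus roughly 10% in the paper, and training is roughly 88% versus roughly 80%.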

gourango01 · May 03 '21 10:05