mPLUG-2

Train/test split JSON files for reproducing the MSR-VTT caption task

Open naajeehxe opened this issue 1 year ago • 4 comments

Thank you for your wonderful project!

Could you provide the train/test split JSON files for the MSR-VTT caption dataset? I am unable to access the following files:

•	datasets/annotations_all/msrvtt_caption/train.jsonl
•	datasets/annotations_all/msrvtt_caption/test.jsonl

naajeehxe, Sep 11 '24 09:09

From my understanding, you used 1k samples for the test set. To accurately reproduce the results from the paper, could you please provide the sample IDs used for the test set?

naajeehxe, Sep 11 '24 10:09

Yes, me too. While trying to reproduce the results, I couldn't find the files mentioned by @naajeehxe, nor the following file: 'datasets/annotations_all/msvd_caption/train.jsonl'. It would be great if you could let us know how to generate them.

idj3tboy, Sep 13 '24 11:09

@idj3tboy I’m not sure if this will be helpful, but I’d like to share how I did it. I downloaded the data from (https://cove.thecvf.com/datasets/839) and used the following two txt files for the train/test split:

•	MSRVTT/videos/train_list_new.txt
•	MSRVTT/videos/test_list_new.txt

As a result, I got 7010 training samples and 2990 test samples. I’m not exactly sure what the 9k/1k train/test split used in the paper refers to, but I was able to reproduce results similar to the paper’s using this 7k/3k split.
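
For anyone who wants to turn those two list files into train.jsonl and test.jsonl, here is a minimal Python sketch. It rests on several assumptions that are not confirmed anywhere in this thread: that each list file holds one video ID per line, that the captions have already been loaded into a dict mapping video ID to a list of caption strings (the `captions_by_video` argument is hypothetical), and that the output records use the field names "video_id" and "caption". The schema that mPLUG-2 actually expects may differ, so check the dataset loader in the repository before using the output.

```python
import json
from pathlib import Path


def read_id_list(path):
    """Return the non-empty, stripped lines of a split list file.

    Assumes one video ID per line (not confirmed for the MSR-VTT lists).
    """
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def build_split_records(video_ids, captions_by_video):
    """Make one record per (video, caption) pair for the given video IDs.

    The field names "video_id" and "caption" are placeholders, not a
    schema confirmed by the mPLUG-2 repository.
    """
    records = []
    for vid in video_ids:
        # Videos with no captions in the mapping are skipped silently.
        for caption in captions_by_video.get(vid, []):
            records.append({"video_id": vid, "caption": caption})
    return records


def write_jsonl(records, path):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

A possible call would be `write_jsonl(build_split_records(read_id_list("MSRVTT/videos/train_list_new.txt"), captions_by_video), "train.jsonl")`, and the same again with test_list_new.txt for the test file.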

If you’re in a hurry, it might be a good idea to give it a try!

naajeehxe, Sep 16 '24 08:09

I don't know if this will help, but I found a 9k/1k train/test split, matching what is written in the paper.

https://github.com/albanie/collaborative-experts/blob/master/misc/datasets/msrvtt/README.md

The 1k-A split was produced by the authors of JSFusion [4]. The train/val splits are listed in the files: train_list_jsfusion.txt (9000 videos) and val_list_jsfusion.txt (1000 videos)
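
Since two different splits come up in this thread (7010/2990 and 9000/1000), it may be worth sanity-checking whichever ID lists you end up with before training. The Python sketch below is only an illustration: `check_split` is a hypothetical helper, and it assumes the IDs have already been read into plain lists, for example one ID per line from train_list_jsfusion.txt and val_list_jsfusion.txt.

```python
def check_split(train_ids, test_ids, expected_train=None, expected_test=None):
    """Return a list of problems found; an empty list means none were found.

    Checks for duplicate IDs within each list, IDs shared between the two
    lists, and (optionally) the number of unique IDs in each list.
    """
    problems = []
    train_set, test_set = set(train_ids), set(test_ids)

    if len(train_set) != len(train_ids):
        problems.append("duplicate IDs in the train list")
    if len(test_set) != len(test_ids):
        problems.append("duplicate IDs in the test list")

    overlap = train_set & test_set
    if overlap:
        problems.append(f"{len(overlap)} ID(s) appear in both splits")

    if expected_train is not None and len(train_set) != expected_train:
        problems.append(f"train has {len(train_set)} IDs, expected {expected_train}")
    if expected_test is not None and len(test_set) != expected_test:
        problems.append(f"test has {len(test_set)} IDs, expected {expected_test}")

    return problems
```

For the split described above, the call would be `check_split(train_ids, val_ids, expected_train=9000, expected_test=1000)`; any returned message points to a list that does not match the stated sizes or that leaks videos between the two sets.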

thanhhff, Feb 02 '25 14:02