
Simple question: What are the public datasets included in InternVid-200M?

Open · jong980812 opened this issue 1 year ago · 1 comment

I would like to use ViCLIP-B-16 on InternVid-200M, as described in "InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation." Does this dataset (or InternVid-FLT) contain videos from Kinetics400, SSV2, and UCF101? The paper does not make clear whether only the labels of these datasets were referenced or whether their videos were also included.

jong980812, Apr 15 '24 16:04

It does not contain videos from the datasets you mentioned. We clarified this in Sec. 3.1 (data curation) as follows: "We ensure the uniqueness of our dataset by creating a database of YouTube video IDs and excluding any videos already present in publicly available datasets (released prior to April 2023)."

shepnerd, Apr 16 '24 02:04
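
For readers who want to picture the ID-based exclusion quoted above, here is a minimal sketch. It is not the authors' code: the function name `exclude_known_videos`, the dictionary layout, and the placeholder IDs are all assumptions made for illustration. It only shows the general idea of dropping candidate videos whose IDs already appear in an index of public datasets.

```python
def exclude_known_videos(candidate_ids, public_dataset_ids):
    """Return candidate IDs that appear in no public dataset.

    candidate_ids: iterable of video ID strings (hypothetical crawl results).
    public_dataset_ids: dict mapping a dataset name to a set of video IDs
        (hypothetical index of previously released datasets).
    Order of first appearance is preserved and repeated candidates are dropped.
    """
    # Merge every public dataset's IDs into one lookup set.
    known = set().union(*public_dataset_ids.values())
    seen = set()
    kept = []
    for video_id in candidate_ids:
        if video_id in known or video_id in seen:
            continue  # already public, or already kept earlier
        seen.add(video_id)
        kept.append(video_id)
    return kept


# Placeholder IDs, not real YouTube IDs.
public = {"dataset_a": {"vidA", "vidB"}, "dataset_b": {"vidC"}}
candidates = ["vidA", "vidX", "vidC", "vidY", "vidX"]
print(exclude_known_videos(candidates, public))
```

A check like this only covers datasets that expose their video IDs, so it says nothing about overlap with datasets distributed as raw clips without IDs.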