InternVideo

InternVid dataset download

Open jqsun98 opened this issue 1 year ago • 3 comments

I'd like to know how to download the InternVid-10M-FLT dataset. It seems that neither Hugging Face nor OpenDataLab provides access to the original videos for download.

jqsun98 avatar Feb 26 '24 03:02 jqsun98

Hi, thank you for your interest in our work. For copyright reasons, we are unable to provide direct download links for the videos in the near future. You may consider using tools such as yt-dlp or youtube-dl to download them.

yinanhe avatar Feb 29 '24 02:02 yinanhe

Thanks for your advice!

Following your suggestion, I found the "InternVid-10M-FLT-INFO.jsonl" file, which contains the YoutubeID, Start_timestamp and End_timestamp fields. I converted the timestamps to seconds and then tried to download a video clip using yt-dlp with the following command:

yt-dlp --download-sections "10.342-15.132" -f best -o "output.mp4" https://www.youtube.com/watch?v=pDq9UzfCtGw

However, it reports "Cannot match chapters since chapter information is unavailable". I'd like to know how to solve this issue.

If I download all the full videos first and then cut out the specific clips, the first step will consume a huge amount of bandwidth and storage, which seems unnecessary.

jqsun98 avatar Mar 26 '24 10:03 jqsun98
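
The timestamp-to-seconds conversion mentioned above could look roughly like the sketch below. It assumes the timestamps are strings in `HH:MM:SS.mmm` form (an assumption about the JSONL file, not something confirmed in this thread), and `timestamp_to_seconds` is a hypothetical helper name.

```python
def timestamp_to_seconds(ts: str) -> float:
    """Convert an 'HH:MM:SS.mmm' string to seconds (format is an assumption)."""
    hours, minutes, seconds = ts.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# Example with a timestamp taken from later in this thread.
start_seconds = timestamp_to_seconds("00:00:03.160")
print(start_seconds)
```

Note that yt-dlp also accepts `HH:MM:SS` style times in `--download-sections`, so this conversion may not be needed at all.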

Try something like this:

yt-dlp.exe --download-sections *00:00:03.160-00:00:06.600 -S vcodec:h264,res,acodec:m4a https://www.youtube.com/watch?v=2iCfz3Wk5ds -o snippet.mp4

adeobootpin avatar Apr 07 '24 03:04 adeobootpin
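
For context on why the last command works: in yt-dlp, a `*` prefix on the `--download-sections` value marks it as a time range, while a value without it is treated as a regular expression matched against chapter titles, which is consistent with the "Cannot match chapters" error above. A minimal sketch of building that command per record follows. It assumes each line of "InternVid-10M-FLT-INFO.jsonl" is a JSON object with the `YoutubeID`, `Start_timestamp` and `End_timestamp` fields mentioned in this thread, with timestamps in `HH:MM:SS.mmm` form. The helper name `build_clip_command`, the `clips` output directory and the file naming are hypothetical.

```python
import json


def build_clip_command(record: dict, out_dir: str = "clips") -> list[str]:
    """Build a yt-dlp argument list that downloads only one clip of a video."""
    video_id = record["YoutubeID"]
    start = record["Start_timestamp"]  # assumed format: HH:MM:SS.mmm
    end = record["End_timestamp"]

    # The leading "*" tells yt-dlp this is a time range, not a chapter regex.
    section = f"*{start}-{end}"

    # Replace ":" so the file name is also valid on Windows.
    safe_start = start.replace(":", "-")
    safe_end = end.replace(":", "-")
    out_path = f"{out_dir}/{video_id}_{safe_start}_{safe_end}.mp4"

    return [
        "yt-dlp",
        "--download-sections", section,
        "-S", "vcodec:h264,res,acodec:m4a",
        "-o", out_path,
        f"https://www.youtube.com/watch?v={video_id}",
    ]


# Illustrative record using the values from the command above.
line = (
    '{"YoutubeID": "2iCfz3Wk5ds", '
    '"Start_timestamp": "00:00:03.160", "End_timestamp": "00:00:06.600"}'
)
cmd = build_clip_command(json.loads(line))
print(" ".join(cmd))

# To actually run the download (requires yt-dlp on PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list to `subprocess.run` avoids shell quoting problems with the `*` character.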