InternVideo
About video dataset curation pipeline
@shepnerd @leexinhao
Hi, huge thanks to the authors for releasing this amazing project! I'm looking forward to using your model and data in my own research.
Regarding video-text data curation, it seems you re-captioned the original data using VideoChat2 and Gemini and re-annotated the videos for instruction fine-tuning. To construct a video-text dataset similar to yours, could you please share the detailed recaptioning and annotation pipeline and code, as shown in the figure below? It would be very helpful for my research. Thank you so much!
It's very simple: just prompt VideoChat2/Gemini to describe the video in detail.
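For anyone trying to reproduce this, here is a rough sketch of what such a recaptioning loop could look like. It is not the authors' code. The prompt wording, the `recaption` helper, and the `caption_fn` callable are all assumptions made for illustration; `caption_fn` stands in for whatever wrapper you write around VideoChat2 or Gemini, since the thread does not show the actual inference calls.

```python
import json
from typing import Callable, Iterable

# Hypothetical prompt: the exact wording the authors used is not given in this thread.
PROMPT = "Describe the video in detail."


def recaption(
    video_paths: Iterable[str],
    caption_fn: Callable[[str, str], str],
    out_path: str,
    prompt: str = PROMPT,
) -> int:
    """Caption each video and write one JSON line per non-empty caption.

    caption_fn(video_path, prompt) is a placeholder for your own model wrapper
    (e.g. around VideoChat2 or Gemini); it must return the caption as a string.
    Returns the number of captions written.
    """
    written = 0
    with open(out_path, "w", encoding="utf-8") as f:
        for path in video_paths:
            caption = caption_fn(path, prompt).strip()
            if not caption:
                # Skip videos for which the model returned nothing usable.
                continue
            record = {"video": path, "prompt": prompt, "caption": caption}
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
            written += 1
    return written
```

Keeping the model call behind a plain callable means the same loop works for either captioner, and the JSONL output can be filtered or merged later without re-running inference.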