InternVideo
Extract multi-modal features using InternVideo2
Hi InternVideo2 team!
Could you please share the code you use to extract the multi-modal features? I'd like to use the models to extract features from my own dataset.
Thanks for your guidance!