Clover
Official PyTorch implementation of Clover: Towards A Unified Video-Language Alignment and Fusion Model (CVPR 2023)
Hi, I want to train the model on the MSR-VTT dataset. It tells me that I need a pkl file, but I can only find the mp4 and txt...
Hi, do you have sample inference code to load the model, pre-process video and text, and get the similarity score? Thanks!
Is the author still updating this repository, including the weight files and instructions for using the code? Thanks.