coot-videotext
COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning
Hi, thank you for sharing your work and congratulations on the paper! I am trying to use COOT to create video descriptions for videos that aren't in ActivityNet. I saw...
I am doing violence detection using video captioning. If I give your model a number of videos containing some type of violence will it be able to tell that in...
Hi, do you have sample inference code to load the model, pre-process video and text, and get the similarity score? Thanks!
Hello, thank you for publishing your work. I'm a bit lost among the many scripts and models, and I was wondering if it's possible to create a single script which...
Hi there. Thank you for your great work on this project. I have been trying to run your model on the ActivityNet Caption as described in your documentation. However, I...