InternVideo
Could you please provide more inference examples for the different tasks in the paper?
For example, temporal grounding on QVHighlights and Charades-STA.
+1
Same question here. I have tried getting an output in frames or seconds and both seem to perform poorly.
Any update? @shepnerd @buaalyx I tried to use InternVideo2/demo; since it uses the pretrained BERT, the feature dim is 512.
+1