liziming5353

Results: 7 issues by liziming5353

Why is the value of "relationCheck" 0.1 in the function "relationPredictCheck" in "main.py"? Is this a random number?

I have read your code but could not find the TomBERT (all-text) part. Is it not included in the code?

Why does the data in stages 2 and 3 contain pure-text Q&A without images or videos?

May I ask how you evaluated on the VQAv2 dataset? I couldn't find the annotation file for the test set on the official website.

How is the model without the MM module implemented in the ablation experiment? Does it directly apply the merge algorithm to the entire video?

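Hello~ After reading your code, I found that Fglobal is the output of the previous batch, so the center computed from it is also the center of the previous round's feature vectors. Is my understanding correct? If it is, why use the previous round's center? If it is not, how is Fglobal updated from F*? Thanks~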

1. The original video is in mkv format, but your code uses image frames. So do we need to preprocess the video first? 2. Is the caption.json the subtitle...