Wenhao Wu
Transferring visual statistic knowledge: For the Kinetics-400 experiments, we sample 60 videos per class, roughly 10% of the training data. These videos are fed directly to CLIP's visual encoder to obtain video embeddings. Using these embeddings and their corresponding labels, we fit LDA to obtain the LDA coefficients, which are then used as the classifier. Transferring textual semantic knowledge: we extract text embeddings of the category names with BERT and use them directly as the classifier.
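A minimal sketch of these two steps, assuming OpenAI's `clip` package for the visual encoder, scikit-learn's `LinearDiscriminantAnalysis` for LDA, and Hugging Face `transformers` for BERT; names such as `videos`, `labels`, and `category_names` are illustrative placeholders, not variables from this repository:

```python
import torch
import clip
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from transformers import AutoTokenizer, AutoModel

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# --- Visual statistic knowledge: LDA coefficients as the classifier ---
# `videos` is a list of (T, 3, 224, 224) frame tensors, `labels` the class indices.
video_embeddings = []
with torch.no_grad():
    for frames in videos:
        feat = model.encode_image(frames.to(device))   # (T, D) per-frame features
        video_embeddings.append(feat.mean(dim=0))      # average-pool to one video embedding
X = torch.stack(video_embeddings).float().cpu().numpy()

lda = LinearDiscriminantAnalysis()
lda.fit(X, labels)
visual_classifier_weight = lda.coef_   # (num_classes, D), used as the linear classifier

# --- Textual semantic knowledge: BERT embeddings of category names as the classifier ---
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
with torch.no_grad():
    tokens = tokenizer(category_names, padding=True, return_tensors="pt")
    text_classifier_weight = bert(**tokens).last_hidden_state[:, 0]   # [CLS] embedding per class
```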
The 4x3 views exist only at test time. Just append the corresponding flags to the command, e.g. --test_crops 3 --test_clips 4; it has nothing to do with num_sample in the config. `sh scripts/run_test.sh configs/k400/k400_train_rgb_vitb-32-f8.yaml exp/k400/ViT-B/32/f8/last_model.pt --test_crops 3 --test_clips 4 `
Thanks for your interest in our work. 1. It is not clear which dataset's results you are referring to. 2. For logit_scale, please refer to the official CLIP code: https://github.com/openai/CLIP/blob/a1d071733d7111c9c014f024669f959182114e33/clip/model.py#L295
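For context, logit_scale in the official CLIP model is a learnable temperature initialised to log(1/0.07), and its exponential scales the cosine-similarity logits at forward time, roughly as sketched here:

```python
import numpy as np
import torch
import torch.nn as nn

# Learnable temperature, initialised to log(1/0.07) as in the official CLIP model
logit_scale = nn.Parameter(torch.ones([]) * np.log(1 / 0.07))

# At forward time the exponential scales the image-text similarity logits, e.g.
# logits_per_image = logit_scale.exp() * image_features @ text_features.t()
```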
> Hello, I admit that this is a good job. However, in the code, you set batch_size=256, but the paper states that it is 128 ( Maybe the version of...
Yes, that's natural. I've already been experimenting with more MLLMs and will release the results soon.
LLaVA-1.6 uses both the base features (336x336 resolution) and additional higher-resolution features. To perform inference similar to LLaVA-1.5, you only need to use the base features, which avoids introducing...
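A purely illustrative sketch of that idea, with hypothetical variable names (not the repository's actual change): keep only the base 336x336 view's features and drop the extra high-resolution tiles so that inference matches the LLaVA-1.5 setting.

```python
# image_features_per_view: (num_views, num_tokens, dim); by assumption, index 0 holds the
# features of the resized 336x336 base view and the remaining entries hold the
# high-resolution tiles. Keeping only the base view mimics LLaVA-1.5-style inference.
base_features = image_features_per_view[0]
visual_tokens = base_features   # feed only these to the language model
```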
I have just updated the code for LLaVA-1.6. Just one line. You can check it out :)
Of course! I'm getting married next week, so I plan to update arXiv with these results in early June after that.
Thank you for your reminder. I have put all the checkpoint links at https://unisyd-my.sharepoint.com/:f:/g/personal/wenhao_wu_sydney_edu_au/EieZBg9a40VAhSIVl6ovAIIBaCuzYamkfE1dMn6MxjjwGg?e=JcqYk4