
Difference between the self-implemented BLIP2 and the HF version?

Open MoonBlvd opened this issue 8 months ago • 0 comments

Hi, thanks for the great work! While reading the code, I noticed that you use self-implemented versions of BLIP, BERT, etc., as opposed to directly importing the corresponding HF modules with the same names, for example BertLMHeadModel. Is this because you needed to modify the models to get better performance?

MoonBlvd, Oct 10 '23 21:10