Video-LLaVA
Error while loading finetuned model for inference.
I am able to pretrain and finetune (using LoRA) the Video-LLaVA model using the scripts at https://github.com/PKU-YuanGroup/Video-LLaVA/tree/main/scripts/v1_5. I used --model_name_or_path 'LanguageBind/Video-LLaVA-7B' for both scripts, pretrain and finetune with LoRA.
When I try to run the finetuned model for inference, I get the following error:
```
RuntimeError: Error(s) in loading state_dict for LlavaLlamaForCausalLM:
size mismatch for model.mm_projector.0.weight: copying a param with shape torch.Size([4096, 1024]) from checkpoint, the shape in current model is torch.Size([2097152, 1]).
size mismatch for model.mm_projector.2.weight: copying a param with shape torch.Size([4096, 4096]) from checkpoint, the shape in current model is torch.Size([8388608, 1]).
```
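Side note: the target shapes in the error look like packed 4-bit buffers rather than real weight matrices. A quick sanity check of that guess (assuming bitsandbytes stores a 4-bit-quantized tensor as a uint8 buffer of shape [numel // 2, 1], two 4-bit values per byte):

```python
# mm_projector weight sizes from the checkpoint, packed two 4-bit values per byte
proj0 = 4096 * 1024   # mm_projector.0.weight: 4,194,304 elements
proj2 = 4096 * 4096   # mm_projector.2.weight: 16,777,216 elements
print(proj0 // 2)     # 2097152 -> matches torch.Size([2097152, 1])
print(proj2 // 2)     # 8388608 -> matches torch.Size([8388608, 1])
```

So the base model is apparently being loaded quantized at inference time.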
I ran into the same problem. Has anyone found a solution? Much appreciated @tarunmis
I think with the default finetune_lora.sh script, the bf16 field is set to True, meaning the training is done in 16 bits by default. When loading the model for inference, setting load_4bit, load_8bit = False, False will load the base model in full precision before applying the LoRA part on top of it successfully (at least in my case); with 4-bit loading enabled, the mm_projector layers are quantized into packed buffers, which is what produces the size mismatch above.
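Concretely, here is a minimal loading sketch (assuming Video-LLaVA exposes load_pretrained_model and get_model_name_from_path as in upstream LLaVA; the checkpoint path is a placeholder from my setup):

```python
from videollava.model.builder import load_pretrained_model
from videollava.mm_utils import get_model_name_from_path

model_path = "./checkpoints/videollava-7b-lora"  # LoRA finetune output dir (placeholder)
model_base = "LanguageBind/Video-LLaVA-7B"       # base model the LoRA was trained on

# Keep both quantization flags off so the base weights (including
# mm_projector) load in full precision before the LoRA weights are merged.
tokenizer, model, processor, context_len = load_pretrained_model(
    model_path,
    model_base,
    get_model_name_from_path(model_path),
    load_8bit=False,
    load_4bit=False,
)
```

In upstream LLaVA's builder, the LoRA-merge branch is taken when the model name contains 'lora' and model_base is given; I assume Video-LLaVA behaves the same, so keep 'lora' in the checkpoint directory name.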