
Trying to run inference using the LoRA adapter (from fine-tuning) together with the base model, and it is not working

Open JSeuma opened this issue 2 years ago • 1 comment

Describe the issue

Issue: I have fine-tuned the liuhaotian/llava-v1.5-7b model and the output is in ./checkpoints/llava-v1.5-7b-task-lora. Inference with only the base model works, but when I try to run inference including the LoRA adapter it gives me an error:

Command for inference with the LoRA adapter: python -m llava.serve.cli --model-path ./checkpoints/llava-v1.5-7b-task-lora --model-base liuhaotian/llava-v1.5-7b --image-file ./playground/data/textvqa/train_images/00a108e5e2160b20.jpg --load-4bit

I pass the adapter as --model-path and the base model as --model-base. The error output is below. Any idea what I am doing wrong? Thanks in advance.

[2024-01-23 12:05:56,258] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Loading LLaVA from base model...
Loading checkpoint shards: 100%|██████████| 2/2 [00:07<00:00, 3.54s/it]
Loading additional LLaVA weights...
Traceback (most recent call last):
  File "/home/jseuma/miniconda3/envs/llava/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/jseuma/miniconda3/envs/llava/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/jseuma/LLaVA/llava/serve/cli.py", line 124, in <module>
    main(args)
  File "/home/jseuma/LLaVA/llava/serve/cli.py", line 32, in main
    tokenizer, model, image_processor, context_len = load_pretrained_model(args.model_path, args.model_base, model_name, args.load_8bit, args.load_4bit, device=args.device)
  File "/home/jseuma/LLaVA/llava/model/builder.py", line 75, in load_pretrained_model
    model.load_state_dict(non_lora_trainables, strict=False)
  File "/home/jseuma/miniconda3/envs/llava/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for LlavaLlamaForCausalLM:
  size mismatch for model.mm_projector.0.weight: copying a param with shape torch.Size([4096, 1024]) from checkpoint, the shape in current model is torch.Size([2097152, 1]).
  size mismatch for model.mm_projector.2.weight: copying a param with shape torch.Size([4096, 4096]) from checkpoint, the shape in current model is torch.Size([8388608, 1]).
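Not a fix, just one reading of the numbers in the traceback: with --load-4bit the mm_projector in the freshly loaded model appears to be quantized by bitsandbytes, which packs two 4-bit values per byte, while the checkpoint holds ordinary fp16 weights, hence the shape mismatch. A quick sanity check on the sizes (this interpretation is an assumption, not confirmed in the thread):

```python
# Sizes reported in the traceback, assuming bitsandbytes-style 4-bit packing
# (two 4-bit values stored per byte, flattened to shape [n_bytes, 1]).
fc1_params = 4096 * 1024   # mm_projector.0.weight in fp16: torch.Size([4096, 1024])
fc2_params = 4096 * 4096   # mm_projector.2.weight in fp16: torch.Size([4096, 4096])

print(fc1_params // 2)     # 2097152 -> matches torch.Size([2097152, 1])
print(fc2_params // 2)     # 8388608 -> matches torch.Size([8388608, 1])
```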

JSeuma avatar Jan 23 '24 11:01 JSeuma

I am also getting the same error when trying to load the fine-tuned LoRA checkpoint using the following code:


```python
tokenizer, model, image_processor, context_len = load_pretrained_model(
    "/data/image_captioning_data/checkpoints_2/llava-v1.5-7b-task-lora/checkpoint-5",
    model_name="liuhaotian/llava-v1.5-7b-lora-finetuned",
    model_base="liuhaotian/llava-v1.5-7b",
    load_8bit=False,
    load_4bit=True)
```
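For comparison, the loading pattern from the LLaVA README derives model_name from the checkpoint path (the name must contain "lora" for the adapter-loading branch to be taken). The sketch below reuses the paths above; load_4bit is left off only to illustrate the unquantized path, not as a confirmed fix:

```python
# Minimal sketch of the README-style loading pattern (paths are the ones above).
from llava.model.builder import load_pretrained_model
from llava.mm_utils import get_model_name_from_path

model_path = "/data/image_captioning_data/checkpoints_2/llava-v1.5-7b-task-lora/checkpoint-5"
model_base = "liuhaotian/llava-v1.5-7b"

tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path=model_path,
    model_base=model_base,
    model_name=get_model_name_from_path(model_path),  # yields a name containing "lora"
    load_8bit=False,
    load_4bit=False,  # assumption: skipping 4-bit quantization avoids the packed [N, 1] shapes
)
```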

AnjanGiri avatar Mar 01 '24 09:03 AnjanGiri

I also encountered the same problem. Have you solved it? Thank you.

1716757342 avatar Mar 11 '24 14:03 1716757342

You first need to merge the base model and the LoRA adapter with the following command:

python scripts/merge_lora_weights.py --model-path /path/to/llava-v1.5-13b-lora --model-base LLaVA_train --save-model-path merge_model

Then run inference with the merged model and you're done.
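If it helps, here is a minimal sketch of what the merge step amounts to, assuming scripts/merge_lora_weights.py follows the usual load-then-save pattern (paths below are the placeholders from the command above):

```python
# Sketch: load base + LoRA adapter through LLaVA's builder (which also restores
# the mm_projector weights and merges the adapter), then save a standalone model.
from llava.model.builder import load_pretrained_model
from llava.mm_utils import get_model_name_from_path

lora_path = "/path/to/llava-v1.5-13b-lora"   # adapter checkpoint (placeholder)
base_path = "LLaVA_train"                    # base model path used above (placeholder)

tokenizer, model, image_processor, context_len = load_pretrained_model(
    lora_path, base_path, get_model_name_from_path(lora_path), device_map="cpu"
)
model.save_pretrained("merge_model")         # merged weights, no adapter needed afterwards
tokenizer.save_pretrained("merge_model")
```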

1716757342 avatar Mar 11 '24 15:03 1716757342


I merged, but this model can't generate anything.

zhyhome avatar Mar 11 '24 16:03 zhyhome
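For completeness, inference against the merged output would look like the original command but pointed at the merged directory and without --model-base, e.g. python -m llava.serve.cli --model-path merge_model --image-file ./playground/data/textvqa/train_images/00a108e5e2160b20.jpg --load-4bit (the directory name is taken from the merge command above; the thread does not say whether this resolves the empty-generation problem).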