
[Usage] Error in load liuhaotian/llava-v1.6-34b checkpoint.

Open UknowSth opened this issue 10 months ago • 2 comments

Describe the issue

Issue: When loading the weights for llava-v1.6-34b, it reports a model parameter shape mismatch.

Command:

    model_path = "liuhaotian/llava-v1.6-34b"
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # Pytorch non-meta copying warning fills out the console
        tokenizer, model, image_processor, context_len = load_pretrained_model(
            model_path=model_path,
            model_base=None,
            model_name=get_model_name_from_path(model_path),
        )

Log:

Loading checkpoint shards:   0%|          | 0/15 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/mnt/petrelfs/wuxiaoxue/Llava/caption_llava.py", line 359, in <module>
    main(args)
  File "/mnt/petrelfs/wuxiaoxue/.conda/envs/panda/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/petrelfs/wuxiaoxue/Llava/caption_llava.py", line 269, in main
    tokenizer, model, image_processor, context_len = load_pretrained_model(
  File "/mnt/petrelfs/wuxiaoxue/Llava/llava/model/builder.py", line 114, in load_pretrained_model
    model = LlavaLlamaForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True, **kwargs)
  File "/mnt/petrelfs/wuxiaoxue/.conda/envs/panda/lib/python3.9/site-packages/transformers/modeling_utils.py", line 2795, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/mnt/petrelfs/wuxiaoxue/.conda/envs/panda/lib/python3.9/site-packages/transformers/modeling_utils.py", line 3123, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/mnt/petrelfs/wuxiaoxue/.conda/envs/panda/lib/python3.9/site-packages/transformers/modeling_utils.py", line 698, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/mnt/petrelfs/wuxiaoxue/.conda/envs/panda/lib/python3.9/site-packages/accelerate/utils/modeling.py", line 348, in set_module_tensor_to_device
    raise ValueError(
ValueError: Trying to set a tensor of shape torch.Size([1024, 7168]) in "weight" (which has shape torch.Size([7168, 7168])), this look incorrect.

I downloaded the model weights from https://hf-mirror.com/liuhaotian/llava-v1.6-34b/tree/main, and it doesn't work. I wonder whether the model weights are wrong, or whether it's a code version issue (I use llava-1.2.0) or a package version issue (Python 3.9, torch 1.12.1+cu113, torchvision 0.13.1+cu113).
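Since the weights came from a mirror, one thing worth ruling out first is an incomplete download: if a shard is missing, loading can fail in confusing ways. Below is a minimal stdlib sketch (not part of LLaVA; it assumes a sharded checkpoint with a standard index file, either `model.safetensors.index.json` or `pytorch_model.bin.index.json`) that lists any referenced shard files absent from the local checkpoint directory:

```python
import json
import os

def check_shards(checkpoint_dir):
    """Return the shard files referenced by the checkpoint index that are
    missing from checkpoint_dir (an incomplete download leaves gaps here)."""
    for idx_name in ("model.safetensors.index.json", "pytorch_model.bin.index.json"):
        index_path = os.path.join(checkpoint_dir, idx_name)
        if os.path.exists(index_path):
            break
    else:
        raise FileNotFoundError("no sharded-checkpoint index found")
    with open(index_path) as f:
        weight_map = json.load(f)["weight_map"]
    shards = sorted(set(weight_map.values()))
    return [s for s in shards if not os.path.exists(os.path.join(checkpoint_dir, s))]
```

Note this only checks that the files exist; a truncated shard would pass, so it is also worth comparing local file sizes against the sizes shown on the hub listing.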

UknowSth avatar Apr 09 '24 09:04 UknowSth

Is there a solution to this problem?

Athicbliss avatar May 10 '24 02:05 Athicbliss

can anyone help?

YangYangTx avatar May 14 '24 07:05 YangYangTx

I also met this problem when I try to use llama3-llava-next-8b

RichardSunnyMeng avatar Jul 08 '24 13:07 RichardSunnyMeng

Maybe you can check whether you are loading the right model. That is how I solved it.
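To act on that suggestion programmatically: the traceback already gives you the expected shape (`(7168, 7168)`) and the shape found in the shard (`(1024, 7168)`), so you can collect parameter shapes from both sides and diff them to see how many tensors disagree, which helps tell a wrong checkpoint apart from a single corrupted shard. A minimal sketch (the parameter name below is illustrative, not taken from the actual checkpoint):

```python
def find_shape_mismatches(expected, actual):
    """Compare two {param_name: shape} dicts and list every disagreement."""
    mismatches = []
    for name, exp in expected.items():
        act = actual.get(name)
        if act is not None and tuple(act) != tuple(exp):
            mismatches.append((name, tuple(exp), tuple(act)))
    return mismatches

# Shapes taken from the error above; the parameter name is hypothetical.
expected = {"some.layer.weight": (7168, 7168)}
actual = {"some.layer.weight": (1024, 7168)}
print(find_shape_mismatches(expected, actual))
```

If nearly every parameter mismatches, the checkpoint likely belongs to a different model variant; if only a few do, suspect a corrupted or truncated shard.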

RichardSunnyMeng avatar Jul 09 '24 02:07 RichardSunnyMeng