
[Usage] LoRA fine-tuned weights provided for vicuna-13b-v1.3 give NaN/inf error when performing inference on COCO-2014 questions after merging LoRA weights

Open DefUs3r opened this issue 1 year ago • 8 comments

Describe the issue

Issue:

We are trying to perform inference on the LoRA weights provided for vicuna-13b-v1.3 here. As mentioned by @haotian-liu in issue #245, we perform the merging step on the LoRA weights using the following command:

python merge_lora_weights.py \
    --model-path hf_checkpoints/llava-v1-0719-336px-lora-vicuna-13b-v1.3 \
    --model-base LLaVA/checkpoints/fastchat_llama-vicuna-v1-3-13b \
    --save-model-path hf_checkpoints/llava-v1-0719-336px-lora-vicuna-13b-v1.3-MERGE
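
For context, the core of such a merge, sketched with plain PEFT, looks roughly like the code below. This is not the actual merge_lora_weights.py, which goes through LLaVA's own model loader and also restores the non-LoRA trainables (e.g. the mm_projector); it only illustrates the LoRA-folding step.

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Minimal sketch of the LoRA-folding step only: load the base model,
# attach the adapter, fold it in, and save the merged weights.
base = AutoModelForCausalLM.from_pretrained(
    "LLaVA/checkpoints/fastchat_llama-vicuna-v1-3-13b",
    torch_dtype=torch.float16,
)
lora = PeftModel.from_pretrained(
    base, "hf_checkpoints/llava-v1-0719-336px-lora-vicuna-13b-v1.3"
)
merged = lora.merge_and_unload()  # folds scaling * (B @ A) into each base weight
merged.save_pretrained(
    "hf_checkpoints/llava-v1-0719-336px-lora-vicuna-13b-v1.3-MERGE"
)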

After this, we perform inference on the 90 COCO-2014 samples mentioned in the paper using:

python -m llava.eval.model_vqa \
    --model-path hf_checkpoints/llava-v1-0719-336px-lora-vicuna-13b-v1.3-MERGE \
    --question-file \
    LLaVA/playground/data/coco2014_val_qa_eval/qa90_questions.jsonl \
    --image-folder \
    LLaVA/coco/coco_dataset/val2014 \
    --answers-file \
    LLaVA/model_inference_testing/coco/coco_val2014_answers-HF-vicuna-v1-3-13b-prompt-v1-test-merge.jsonl
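
Before launching the full 90-question run, a cheap sanity check is to scan the merged checkpoint for non-finite weights. A minimal sketch in plain PyTorch, assuming the merge saved standard pytorch_model-*.bin shards:

import glob
import torch

# Report every tensor in the merged checkpoint that contains NaN or inf.
merge_dir = "hf_checkpoints/llava-v1-0719-336px-lora-vicuna-13b-v1.3-MERGE"
for shard in sorted(glob.glob(f"{merge_dir}/pytorch_model-*.bin")):
    state_dict = torch.load(shard, map_location="cpu")
    for name, tensor in state_dict.items():
        if tensor.is_floating_point() and not torch.isfinite(tensor).all():
            print(f"non-finite values in {name} ({shard})")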

This inference produces the following error log:

  0%|                                                                                                                                         | 0/90 [00:00<?, ?it/s]/home/anaconda3/envs/llavacuda6/lib/python3.10/site-packages/transformers/generation/utils.py:1270: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation )
  warnings.warn(
  0%|                                                                                                                                         | 0/90 [00:33<?, ?it/s]
Traceback (most recent call last):
  File "/home/anaconda3/envs/llavacuda6/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/anaconda3/envs/llavacuda6/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/workspace/cgy/LLAVA/LLaVA/llava/eval/model_vqa.py", line 112, in <module>
    eval_model(args)
  File "/home/workspace/cgy/LLAVA/LLaVA/llava/eval/model_vqa.py", line 66, in eval_model
    output_ids = model.generate(
  File "/home/anaconda3/envs/llavacuda6/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/anaconda3/envs/llavacuda6/lib/python3.10/site-packages/transformers/generation/utils.py", line 1588, in generate
    return self.sample(
  File "/home/anaconda3/envs/llavacuda6/lib/python3.10/site-packages/transformers/generation/utils.py", line 2678, in sample
    next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
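
For context, this error is raised by torch.multinomial itself, which rejects any probability row containing NaN, inf, or a negative entry, so the logits are already corrupted before sampling. The check is easy to reproduce in isolation:

import torch

# torch.multinomial refuses distributions with NaN/inf/negative entries --
# this is exactly the error in the log above.
probs = torch.tensor([[0.5, float("nan"), 0.5]])
torch.multinomial(probs, num_samples=1)
# RuntimeError: probability tensor contains either `inf`, `nan` or element < 0

Passing do_sample=False to generate() sidesteps torch.multinomial, but if the merged weights produce NaN logits the greedy output will still be meaningless, so that only masks the symptom.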

The command we use to generate the model-base passed to merge_lora_weights.py is as follows:

python -m fastchat.model.apply_delta \
    --base huggyllama/llama-13b \
    --target checkpoints/fastchat_llama-vicuna-v1-3-13b \
    --delta lmsys/vicuna-13b-v1.3
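
To rule out the recovered base model itself, a quick text-only smoke test can confirm that the delta application produced finite logits (a sketch, reusing the checkpoint path from the command above):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# If the delta-applied Vicuna base already emits NaN logits, the problem
# predates the LoRA merge entirely.
path = "checkpoints/fastchat_llama-vicuna-v1-3-13b"
tok = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(
    path, torch_dtype=torch.float16, device_map="auto"
)
inputs = tok("The capital of France is", return_tensors="pt").to(model.device)
with torch.no_grad():
    logits = model(**inputs).logits
print("logits finite:", torch.isfinite(logits).all().item())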

Interestingly, the same evaluation procedure, when run on the provided LoRA-merged weights, returns:

all : 76.3
complex : 90.0
conv : 75.4
detail : 63.4

implying that either merge_lora_weights.py has an issue, the provided LoRA weights have an issue, or the model-base is faulty.
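
If the saved weights scan clean, the NaN may appear only at run time (for example, an fp16 overflow in a single layer). One way to localize that is to register forward hooks that flag any module emitting non-finite activations; a sketch in plain PyTorch, where model is assumed to be the already-loaded merged checkpoint:

import torch

# Hooks that report any module whose output contains NaN/inf during a
# forward pass; `model` is assumed to be the loaded merged checkpoint.
def make_hook(name):
    def hook(module, inputs, output):
        out = output[0] if isinstance(output, tuple) else output
        if torch.is_tensor(out) and not torch.isfinite(out).all():
            print(f"non-finite output from module: {name}")
    return hook

handles = [m.register_forward_hook(make_hook(n))
           for n, m in model.named_modules() if n]
# ... run model.generate(...) as in model_vqa.py, then detach the hooks:
for h in handles:
    h.remove()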

Kindly suggest a fix for whatever is causing this error.

DefUs3r avatar Aug 30 '23 18:08 DefUs3r

I got the same error following the preview LoRA inference steps (link). Screenshot: 2023-09-13 09 32 28

wanghao-cst avatar Sep 13 '23 01:09 wanghao-cst

I also got the same error when running inference with my own fine-tuned model.

Cubism-star avatar Sep 15 '23 01:09 Cubism-star

> [the original issue, quoted in full]

Hi, have you fixed the issue?

wanghao-cst avatar Sep 21 '23 02:09 wanghao-cst

> [the original issue, quoted in full]
>
> Hi, have you fixed the issue?

No, this is not yet fixed.

DefUs3r avatar Oct 07 '23 18:10 DefUs3r

How did you download the dataset coco/coco_dataset/val2014?

terminator123 avatar Dec 14 '23 08:12 terminator123

> How did you download the dataset coco/coco_dataset/val2014?

Do you know how to download coco_val2014 now?

kuaileqipaoshui avatar Jan 06 '24 15:01 kuaileqipaoshui
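
For reference, the val2014 images are distributed from the official COCO site; assuming the standard mirror and the image-folder layout used in the commands above, the download is:

wget http://images.cocodataset.org/zips/val2014.zip
unzip val2014.zip -d LLaVA/coco/coco_dataset/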

@Cubism-star

> I also got the same error when running inference with my own fine-tuned model.

Me too. Did you fix it?

Ryosuke0104 avatar Jan 20 '24 06:01 Ryosuke0104

Any update? I also face the same issue: after fine-tuning, I am not able to merge.

Kamleshpaul avatar Feb 27 '24 05:02 Kamleshpaul

Why has nobody fixed this?

ChenRan2000 avatar Apr 19 '24 02:04 ChenRan2000