
InstructBlip-vicuna13B: LLM generate error with position_ids shape mismatch

Status: Open · dingtine opened this issue 1 year ago • 7 comments

Code:

ossfs/workspace/LAVIS/lavis/models/blip2_models/modeling_llama.py:529 in forward
position_ids = position_ids.view(-1, seq_length).long()

Log: RuntimeError: shape '[-1, 40]' is invalid for input of size 205

When I print position_ids' shape, it is logged twice: position_ids torch.Size([5, 40]) and then position_ids torch.Size([5, 41]). Can you tell me why, and how to solve this problem? Thanks!

dingtine, Jun 22 '23 16:06

Now I found that in the prepare_inputs_for_generation function in modeling_llama.py, inputs_embeds keeps its old value and shape and is not updated.

dingtine, Jun 25 '23 17:06
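To make the failure mode concrete, here is a minimal sketch of the shape mismatch described above. It is an illustration only: it assumes the stale inputs_embeds pointed out in the previous comment is the cause, and the tensor sizes are taken from the log.

import torch

# Illustrative only: reproduces the arithmetic behind
# "shape '[-1, 40]' is invalid for input of size 205".
batch, prompt_len = 5, 40

# seq_length is read from the (stale) inputs_embeds, still 40 positions wide ...
inputs_embeds = torch.zeros(batch, prompt_len, 4096)
seq_length = inputs_embeds.shape[1]  # 40

# ... but after one decoding step the attention mask has grown to 41 columns,
# so the position_ids derived from it hold 5 * 41 = 205 values.
attention_mask = torch.ones(batch, prompt_len + 1)
position_ids = attention_mask.cumsum(-1) - 1  # shape [5, 41]

position_ids = position_ids.view(-1, seq_length).long()  # RuntimeError, as in the log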

same question :(

denny3388, Jun 27 '23 03:06

@dingtine I downloaded the vicuna weights again and that fixed the problem! The difference is that the first time I used git clone to download the weights, and got the same error as you. So I downloaded the weights again through the download link, and the problem was solved!

denny3388, Jun 27 '23 03:06


@denny3388 Hi, how did you solve this problem? I met the same problem. I got vicuna by following this step (the merged vicuna-13b is about 37 GB):

python -m fastchat.model.apply_delta \
    --base your_path/llama-13b-hf \
    --target your_path/vicuna-13b \
    --delta your_path/vicuna-13b-delta-v0

wojiaohumaocheng, Jun 27 '23 04:06
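As a hedged sanity check for the merged checkpoint (denny3388's fix above was simply re-downloading the weights), loading the result of apply_delta once can surface a broken merge before running InstructBLIP. your_path is the same placeholder as in the command above.

import torch
from transformers import AutoTokenizer, LlamaForCausalLM

# If apply_delta produced a corrupt or mismatched checkpoint, this load tends
# to fail outright or show a tokenizer/config vocab mismatch.
merged = "your_path/vicuna-13b"  # placeholder path

tokenizer = AutoTokenizer.from_pretrained(merged, use_fast=False)
model = LlamaForCausalLM.from_pretrained(merged, torch_dtype=torch.float16)
print(len(tokenizer), model.config.vocab_size)  # should be consistent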

I encountered the same error. One workaround is to enable use_cache=True for LlamaForCausalLM. Concretely, you can edit this line from:

self.llm_model = LlamaForCausalLM.from_pretrained(llm_model, torch_dtype=torch.float16)

to:

self.llm_model = LlamaForCausalLM.from_pretrained(llm_model, torch_dtype=torch.float16, use_cache=True)

albert-cwkuo, Jun 27 '23 23:06
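For completeness, a minimal sketch of the same fix applied after loading, in case you would rather not edit the from_pretrained call inside LAVIS. The checkpoint path is a placeholder, not one from this thread.

import torch
from transformers import LlamaForCausalLM

llm_model = "path/to/vicuna-13b"  # placeholder path

model = LlamaForCausalLM.from_pretrained(llm_model, torch_dtype=torch.float16)
model.config.use_cache = True  # equivalent to passing use_cache=True at load time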

This issue may be related to the Hugging Face transformers version: different versions process the inputs differently during generation. I failed with version 4.26 but succeeded with version 4.28 when using BLIP-2.

zdxff, Jul 06 '23 03:07
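A quick way to confirm which transformers version your environment actually resolves to, before and after pinning. 4.28 is the version reported to work above; whether it also fixes your setup is an assumption to verify.

import transformers
print(transformers.__version__)  # e.g. '4.28.1'
# to pin it: pip install "transformers==4.28.0"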


Thanks, I met the same problem and fixed it by adding use_cache=True as @albert-cwkuo suggested above.

luxuriant0116, Feb 22 '24 12:02