LAVIS
InstructBLIP-Vicuna13B: LLM generate error with position_ids shape
Code:
ossfs/workspace/LAVIS/lavis/models/blip2_models/modeling_llama.py:529 in forward
position_ids = position_ids.view(-1, seq_length).long()
Log: RuntimeError: shape '[-1, 40]' is invalid for input of size 205
When I print position_ids' shape, it is logged twice: position_ids torch.Size([5, 40]) and position_ids torch.Size([5, 41]). Can you tell me why, and how to solve this problem? Thanks!
Update: I found that the prepare_inputs_for_generation function in modeling_llama.py keeps the old value and shape of inputs_embeds; it is not updated between generation steps.
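For what it's worth, the numbers in the log line up with stale inputs_embeds: 205 = 5 × 41, so position_ids has already advanced to 41 tokens per sample while seq_length is still read as 40. A minimal sketch (plain PyTorch, not LAVIS code; batch size and lengths taken from the log above) that reproduces the same reshape failure:

import torch

# Hedged illustration: position_ids has grown to 41 entries per sample (5 * 41 = 205
# elements), but seq_length is still taken from the stale inputs_embeds of length 40.
batch_size, stale_len, new_len = 5, 40, 41
position_ids = torch.arange(new_len).repeat(batch_size)  # 205 elements in total
seq_length = stale_len
try:
    position_ids = position_ids.view(-1, seq_length).long()
except RuntimeError as err:
    print(err)  # shape '[-1, 40]' is invalid for input of size 205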
same question :(
@dingtine I downloaded the vicuna weights again and that fixed the problem! The difference is that the first time I used git clone to download the weights and got the same error as yours. When I downloaded the weights again through the download link, the problem was solved!
Hi, how do I solve this problem? I hit the same issue. I obtained Vicuna with this step:
python -m fastchat.model.apply_delta \
  --base your_path/llama-13b-hf \
  --target your_path/vicuna-13b \
  --delta your_path/vicuna-13b-delta-v0
# the merged vicuna-13b checkpoint is about 37 GB
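Since the comment above suggests the error can come from a bad weight download or merge, a quick hedged sanity check is to load the merged checkpoint directly (your_path/vicuna-13b is just the placeholder path from the command above):

from transformers import AutoTokenizer, LlamaForCausalLM
import torch

# Hedged sanity check: if the merged checkpoint is intact, both of these should load
# without shape mismatches or missing-key errors.
tokenizer = AutoTokenizer.from_pretrained("your_path/vicuna-13b", use_fast=False)
model = LlamaForCausalLM.from_pretrained("your_path/vicuna-13b", torch_dtype=torch.float16)
print(model.config.vocab_size, model.config.hidden_size)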
I encountered the same error. One workaround is to enable use_cache=True for LlamaForCausalLM. Concretely, you can edit this line from:
self.llm_model = LlamaForCausalLM.from_pretrained(llm_model, torch_dtype=torch.float16)
to:
self.llm_model = LlamaForCausalLM.from_pretrained(llm_model, torch_dtype=torch.float16, use_cache=True)
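If editing the from_pretrained call is inconvenient, a hedged alternative sketch is to flip the flag on the loaded config (the model path below is a placeholder); with the KV cache enabled, generate() only feeds the newly produced token instead of the full growing sequence:

from transformers import LlamaForCausalLM
import torch

llm_model = "your_path/vicuna-13b"  # placeholder path
model = LlamaForCausalLM.from_pretrained(llm_model, torch_dtype=torch.float16)
model.config.use_cache = True  # same effect as passing use_cache=True to from_pretrained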
This issue may be related to the Hugging Face transformers version: different versions prepare the inputs differently during generation. In my case with BLIP-2, version 4.26 failed but version 4.28 worked.
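To check which transformers version you are on (4.28 is only the version reported to work in the comment above, not an official requirement):

import transformers
print(transformers.__version__)
# if needed: pip install "transformers==4.28.0"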
Thanks, I met the same problem and fixed it by adding use_cache=True.