
RuntimeError: The size of tensor a (32000) must match the size of tensor b (32001) at non-singleton dimension 0

Open LetsGoFir opened this issue 2 years ago • 8 comments

Environment: fastchat==0.2.9; transformers is also the newest (commit a2789addd); delta weights: vicuna-13b-delta-v1.1. (Image attached.)
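
For context, the 32000 vs 32001 mismatch likely arises because the v1.1 delta checkpoint carries one extra embedding row (a pad token added on top of LLaMA's 32000-token vocabulary), while a freshly converted base model still has 32000 rows. A minimal sketch of the failure and the usual remedy, with plain Python lists standing in for the embedding tensors (the real fix resizes the base model's token embeddings before the delta is added):

```python
# Sketch of the failure: vicuna-13b-delta-v1.1 has 32001 embedding rows
# (LLaMA's 32000-token vocab plus one added pad token), while a freshly
# converted base model has 32000. apply_delta adds the two tensors
# elementwise, so the row counts must match. Plain Python lists stand in
# for the embedding tensors here.

BASE_VOCAB = 32000   # rows in the converted llama-13b-hf embedding
DELTA_VOCAB = 32001  # rows in the vicuna-13b-delta-v1.1 embedding

def add_delta(base_rows, delta_rows):
    """Elementwise add, failing the same way torch does on a size mismatch."""
    if len(base_rows) != len(delta_rows):
        raise RuntimeError(
            f"The size of tensor a ({len(base_rows)}) must match the size "
            f"of tensor b ({len(delta_rows)}) at non-singleton dimension 0"
        )
    return [b + d for b, d in zip(base_rows, delta_rows)]

base = [0.0] * BASE_VOCAB
delta = [1.0] * DELTA_VOCAB

try:
    add_delta(base, delta)          # reproduces the error in this issue
except RuntimeError as err:
    print(err)

# The remedy mirrors transformers' resize_token_embeddings: grow the base
# embedding to the delta's vocab size before adding.
base = base + [0.0] * (DELTA_VOCAB - len(base))
merged = add_delta(base, delta)
print(len(merged))  # 32001
```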

LetsGoFir avatar May 18 '23 02:05 LetsGoFir

could you check for this? @merrymercy

LetsGoFir avatar May 18 '23 06:05 LetsGoFir

@LetsGoFir try playing with gradient_accumulation_steps: instead of 16, try smaller values. Also see issue 540.

ZohaibDurrani avatar May 18 '23 21:05 ZohaibDurrani

What is the command you use? Could you also try cleaning your Hugging Face cache with rm -rf ~/.cache/huggingface?

merrymercy avatar May 20 '23 13:05 merrymercy

Thanks for your reply! Let me try

LetsGoFir avatar May 22 '23 09:05 LetsGoFir

> What is the command you use? Could you also try cleaning your Hugging Face cache with rm -rf ~/.cache/huggingface?

Still a dimension mismatch. This is my command, @merrymercy:

python3 -m fastchat.model.apply_delta \
    --base-model-path llama-13b-hf/ \
    --target-model-path vicuna-13b-v1-1 \
    --delta-path vicuna-13b-delta-v1.1
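
One quick pre-flight check before rerunning the command above is to compare vocab_size in the two models' config.json files. A stdlib-only sketch (the helper names are illustrative, not part of FastChat):

```python
# Compare vocab_size in the config.json of the base model and the delta
# before running apply_delta. Stdlib only; helper names are illustrative.
import json
from pathlib import Path

def vocab_size(model_dir):
    """Read vocab_size from a Hugging Face model directory's config.json."""
    with open(Path(model_dir) / "config.json") as f:
        return json.load(f)["vocab_size"]

def deltas_compatible(base_dir, delta_dir):
    base_v = vocab_size(base_dir)
    delta_v = vocab_size(delta_dir)
    if base_v != delta_v:
        print(f"vocab mismatch: base={base_v}, delta={delta_v}")
    return base_v == delta_v
```

If the sizes differ (32000 vs 32001 here), the usual fix is to reconvert the base model with a transformers version matching the one the delta was built with.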

LetsGoFir avatar Jun 01 '23 03:06 LetsGoFir

> @LetsGoFir try playing with gradient_accumulation_steps: instead of 16, try smaller values. Also see issue 540.

I am trying to obtain the Vicuna weights, not fine-tune.

LetsGoFir avatar Jun 01 '23 03:06 LetsGoFir

Did you fix this bug? I've run into the same problem too.

FatCatPlus avatar Jun 04 '23 15:06 FatCatPlus

> Did you fix this bug? I've run into the same problem too.

OK, I worked it out. I suspect the cause is that the version of convert_llama_weights_to_hf.py you used does not match your installed transformers version.

FatCatPlus avatar Jun 04 '23 16:06 FatCatPlus

> Did you fix this bug? I've run into the same problem too.

> OK, I worked it out. I suspect the cause is that the version of convert_llama_weights_to_hf.py you used does not match your installed transformers version.

Hello, could you please share the correct version of convert_llama_weights_to_hf.py and the matching transformers version? Thanks.

lzk9508 avatar Jul 05 '23 07:07 lzk9508

The true reason is that the dataset size is not divisible by the batch size; just add the parameter --dataloader_drop_last True.
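
For readers hitting this during fine-tuning rather than weight conversion: a pure-Python sketch of what --dataloader_drop_last True changes, with plain lists standing in for a torch DataLoader:

```python
# Stand-in for the DataLoader behavior that --dataloader_drop_last toggles:
# with 10 samples and batch size 4, the last batch holds only 2 samples;
# drop_last discards that ragged final batch so every training step sees a
# full-sized batch.

def batches(dataset, batch_size, drop_last=False):
    out = [dataset[i:i + batch_size] for i in range(0, len(dataset), batch_size)]
    if drop_last and out and len(out[-1]) < batch_size:
        out.pop()  # drop the short final batch
    return out

data = list(range(10))
print([len(b) for b in batches(data, 4)])                  # [4, 4, 2]
print([len(b) for b in batches(data, 4, drop_last=True)])  # [4, 4]
```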

shleo avatar Jul 24 '23 02:07 shleo

@LetsGoFir did you solve it? There seem to be plenty of suggestions here.

I will close this one, as the suggestions seem helpful enough and it's not a bug in FastChat, but please reopen it if you feel we need to look into it further.

surak avatar Oct 23 '23 09:10 surak