
size mismatch when finetuning Falcon

Open · Lumingous opened this issue 2 years ago • 1 comment

Hi, I'm trying to finetune Falcon-7B with LoRA, but I've hit this error:

RuntimeError: Error(s) in loading state_dict for GPT:
size mismatch for lm_head.weight: copying a param with shape torch.Size([70144, 4544]) from checkpoint, the shape in current model is torch.Size([65024, 4544]).
size mismatch for transformer.wte.weight: copying a param with shape torch.Size([70144, 4544]) from checkpoint, the shape in current model is torch.Size([65024, 4544]).

Has anyone else encountered this error? If so, how can I solve this? Thanks a lot!

Lumingous avatar Jul 11 '23 07:07 Lumingous
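
For context, the traceback above is PyTorch's generic shape check in load_state_dict: the saved lm_head/wte weights were written for a 70144-token vocabulary, while the freshly built Falcon-7B model expects 65024 rows. A minimal sketch of the same class of failure, using stand-in embedding layers (the vocabulary sizes mirror the report, the hidden size is shrunk to keep it cheap; none of this is lit-gpt code):

import torch.nn as nn

# "checkpoint" produced for a 70144-token vocabulary
saved_state = nn.Embedding(70144, 8).state_dict()
# freshly constructed model expecting a 65024-token vocabulary
model = nn.Embedding(65024, 8)

try:
    model.load_state_dict(saved_state)
except RuntimeError as err:
    # prints: size mismatch for weight: copying a param with shape torch.Size([70144, 8]) ...
    print(err)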

I just ran the steps and it worked. Did you call python scripts/prepare_alpaca.py --checkpoint_dir checkpoints/tiiuae/falcon-7b? Alpaca needs to be processed with the specific model's tokenizer. Can you describe which steps you followed?

carmocca avatar Jul 11 '23 14:07 carmocca
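
For anyone comparing notes, the end-to-end sequence for Falcon-7B at the time looked roughly like the following. The prepare_alpaca call is the one quoted above; the surrounding download/convert/finetune commands and their flags follow the lit-gpt README of that era, so treat them as approximate and check them against your checkout:

# 1. download the Hugging Face weights and tokenizer
python scripts/download.py --repo_id tiiuae/falcon-7b
# 2. convert them into the lit-gpt checkpoint format
python scripts/convert_hf_checkpoint.py --checkpoint_dir checkpoints/tiiuae/falcon-7b
# 3. tokenize Alpaca with this model's tokenizer
python scripts/prepare_alpaca.py --checkpoint_dir checkpoints/tiiuae/falcon-7b
# 4. finetune with LoRA, pointing at the same checkpoint_dir and the prepared data
python finetune/lora.py --checkpoint_dir checkpoints/tiiuae/falcon-7b --data_dir data/alpaca

If any of these steps was run against a different checkpoint_dir, the saved tensors and the freshly built model can disagree on vocabulary size, which is exactly the shape mismatch in the traceback above.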