The shape in current model is torch.Size([0])
Thank you for this amazing project!
I'm having trouble loading the model when attempting to finetune, with both the LoRA and adapter scripts.
File "finetune/lora.py", line 106, in main
model.load_state_dict(checkpoint, strict=False)
File ".local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2056, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Parrot:
size mismatch for lm_head.weight: copying a param with shape torch.Size([50688, 4096]) from checkpoint, the shape in current model is torch.Size([0]).
size mismatch for transformer.wte.weight: copying a param with shape torch.Size([50688, 4096]) from checkpoint, the shape in current model is torch.Size([0]).
size mismatch for transformer.h.0.norm_1.weight: copying a param with shape torch.Size([4096]) from checkpoint, the shape in current model is torch.Size([0]).
size mismatch for transformer.h.0.norm_1.bias: copying a param with shape torch.Size([4096]) from checkpoint, the shape in current model is torch.Size([0]).
...
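For context, `strict=False` only tolerates missing and unexpected keys; shape mismatches still raise. Here is a minimal sketch that reproduces the same failure (the `Tiny` module and its sizes are made up for illustration):

```python
import torch
import torch.nn as nn

class Tiny(nn.Module):
    def __init__(self):
        super().__init__()
        # An unmaterialized parameter, analogous to the torch.Size([0])
        # shapes reported in the traceback above.
        self.weight = nn.Parameter(torch.empty(0))

model = Tiny()
# Raises RuntimeError despite strict=False:
#   size mismatch for weight: copying a param with shape torch.Size([4096])
#   from checkpoint, the shape in current model is torch.Size([0]).
model.load_state_dict({"weight": torch.randn(4096)}, strict=False)
```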
I am using stablelm-base-alpha-3b and hit the same errors with falcon.
However, generate/base.py loads the model and generates a response without any problems.
Any ideas would be appreciated!
Looks like an issue with the model instantiation. Can you pull main and call scripts/convert_hf_checkpoint.py again?
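For reference, the re-conversion would look roughly like the sketch below. The `--checkpoint_dir` flag and the checkpoint path are assumptions based on the usual layout; check the script's `--help` on your version:

```sh
git pull origin main
# Flag and path below are assumed; confirm with:
#   python scripts/convert_hf_checkpoint.py --help
python scripts/convert_hf_checkpoint.py \
    --checkpoint_dir checkpoints/stabilityai/stablelm-base-alpha-3b
```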