
merge_lora_weights_and_save_hf_model.py Error while deserializing header: HeaderTooLarge

Open Spongeorge opened this issue 1 year ago • 0 comments

I am trying to run the following command:

python /projects/geba2844/LongLora/LongLoRA/merge_lora_weights_and_save_hf_model.py \
        --base_model /projects/geba2844/LongLora/Llama-2-7b-hf \
        --peft_model /projects/geba2844/LongLora/Llama-2-7b-longlora-8k \
        --context_size 8192 \
        --save_path /projects/geba2844/LongLora/Llama-2-7b-longlora-8k-hf

It returns the following error:

bash-4.4$ bash merge.sh
base model /projects/geba2844/LongLora/Llama-2-7b-hf
peft model /projects/geba2844/LongLora/Llama-2-7b-longlora-8k
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/projects/geba2844/LongLora/LongLoRA/merge_lora_weights_and_save_hf_model.py", line 113, in <module>
    main(args)
  File "/projects/geba2844/LongLora/LongLoRA/merge_lora_weights_and_save_hf_model.py", line 68, in main
    model = transformers.AutoModelForCausalLM.from_pretrained(
  File "/projects/geba2844/software/anaconda/envs/longlorabuild/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 565, in from_pretrained
    return model_class.from_pretrained(
  File "/projects/geba2844/software/anaconda/envs/longlorabuild/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3307, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/projects/geba2844/software/anaconda/envs/longlorabuild/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3681, in _load_pretrained_model
    state_dict = load_state_dict(shard_file)
  File "/projects/geba2844/software/anaconda/envs/longlorabuild/lib/python3.10/site-packages/transformers/modeling_utils.py", line 463, in load_state_dict
    with safe_open(checkpoint_file, framework="pt") as f:
safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooLarge
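
A possible way to narrow this down, offered as a sketch rather than a confirmed diagnosis: a `.safetensors` file begins with an 8-byte little-endian unsigned integer giving the length of the JSON header, so `HeaderTooLarge` usually means the first bytes of a shard are not a real safetensors header. Typical causes are a Git LFS pointer file left in place of the actual weights, or an incomplete or corrupted download. The helper below reads only those first bytes. The function name `inspect_safetensors` and the constant `ASSUMED_MAX_HEADER` are hypothetical names for illustration, and the 100 MB cap is an assumption about the parser's limit, not something taken from the traceback.

```python
import struct
import sys
from pathlib import Path

# First line of a Git LFS pointer file (a small text stub, not real weights).
LFS_PREFIX = b"version https://git-lfs.github.com/spec/v1"
# Assumption: approximate header size cap enforced by the safetensors parser.
ASSUMED_MAX_HEADER = 100_000_000


def inspect_safetensors(path):
    """Return a short diagnosis string for one checkpoint file."""
    file_size = Path(path).stat().st_size
    with open(path, "rb") as f:
        head = f.read(len(LFS_PREFIX))
    if head.startswith(LFS_PREFIX):
        return "git-lfs pointer"
    if len(head) < 8:
        return "too short"
    # The first 8 bytes hold the JSON header length (little-endian u64).
    (header_len,) = struct.unpack("<Q", head[:8])
    if header_len > ASSUMED_MAX_HEADER or header_len + 8 > file_size:
        return "implausible header length"
    return "header length ok"


if __name__ == "__main__":
    # Usage: python check_shards.py /path/to/model/*.safetensors
    for shard in sys.argv[1:]:
        print(shard, "->", inspect_safetensors(shard))
```

Running it over the `*.safetensors` files in the `--base_model` directory would show whether any shard is a pointer stub or a damaged file. If one is, re-downloading that shard (for example with `git lfs pull` in a cloned repository) is the usual fix.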

Spongeorge, Jan 23 '24 20:01