llama.cpp
Issue with convert-lora-to-ggml.py
Hi!
I am currently using the Hugging Face SFTTrainer to fine-tune my own model. I save the model to Weights & Biases, then download the adapter weights to my computer and use llama.cpp/convert-lora-to-ggml.py to convert the files into a format I can load into llama.cpp.
This is the trainer setup I am using for training:
trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    peft_config=config,
    dataset_text_field="text",
    max_seq_length=None,  # You can specify the maximum sequence length here
    tokenizer=tokenizer,
    args=training_arguments,
    packing=packing,
)
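For completeness, the step between training the adapter and running the conversion script looks roughly like this. It's only a sketch; the my_lora_adapter directory name is a placeholder, not a detail from this thread.

# Sketch: save the PEFT adapter locally, then point the llama.cpp script at it.
trainer.model.save_pretrained("my_lora_adapter")   # writes adapter_config.json + adapter weights
tokenizer.save_pretrained("my_lora_adapter")

# Then, from a llama.cpp checkout:
#   python convert-lora-to-ggml.py my_lora_adapter
# which should produce my_lora_adapter/ggml-adapter-model.bin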
Below is the error I am getting:
Error: unrecognized tensor name base_model.model.lm_head.base_layer.weight
These are the first three keys when I print out model.keys() in the script:
'base_model.model.lm_head.base_layer.weight', 'base_model.model.lm_head.lora_A.weight', 'base_model.model.lm_head.lora_B.weight',
...
with open(output_path, "wb") as fout:
    fout.truncate()

    write_file_header(fout, params)
    print(model.keys())  # added this line to get the printout
    for k, v in model.items():
        orig_k = k
        if k.endswith(".default.weight"):
...
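If you want to see which tensor names the script will encounter without editing it, you can read them straight from the adapter file. A minimal sketch, assuming the adapter was saved with the default name adapter_model.safetensors and a placeholder directory:

# Sketch: list the adapter's tensor names directly from the safetensors file.
from safetensors import safe_open

with safe_open("my_lora_adapter/adapter_model.safetensors", framework="pt") as f:
    for name in f.keys():
        print(name)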
I'd prefer not to hand out the files I trained, as they contain information I can't share.
@Gunnar-Stunnar I am running into a similar error with the conversion script when trying to convert a LoRA from a StableLM-arch-derived model. I'll update if I can find a fix.
Same with me! But I found a workaround.
I've tried mistralai/Mixtral-8x7B-Instruct-v0.1
and NousResearch/Nous-Hermes-2-Yi-34B
(both threw this warning, if it helps: /home/vic/.local/lib/python3.10/site-packages/peft/utils/save_and_load.py:160: UserWarning: Setting `save_embedding_layers` to `True` as the embedding layer has been resized during finetuning.)
I tried it w/o safetensors:
trainer.model.save_pretrained(finetuned_model_name, safe_serialization=False, from_pt=True)
and it works!
(I still got this error, but it's not a blocker: Error: unrecognized tensor name base_model.model.model.embed_tokens.weight)
I just converted Gemma LoRA adapter weights to ggml. How can one use the LoRA adapter weights to make inferences with llama.cpp?
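In case it's useful, one way to run the converted adapter (not something confirmed in this thread, just a sketch assuming the llama-cpp-python bindings and placeholder paths):

# Sketch: load a base model plus the converted LoRA adapter with llama-cpp-python.
# Both paths are placeholders. The CLI equivalent is roughly:
#   ./main -m models/base-model-f16.gguf --lora my_lora_adapter/ggml-adapter-model.bin -p "..."
from llama_cpp import Llama

llm = Llama(
    model_path="models/base-model-f16.gguf",              # base model the LoRA was trained on
    lora_path="my_lora_adapter/ggml-adapter-model.bin",   # output of convert-lora-to-ggml.py
)
out = llm("### Instruction:\nSay hello.\n\n### Response:\n", max_tokens=64)
print(out["choices"][0]["text"])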
Update: I haven't deployed it yet to see full results, but I found that if I add a line to the script to step over that layer, the output works once I load it in llama.cpp.
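Roughly, the change is to skip the offending tensors in the write loop of convert-lora-to-ggml.py instead of erroring out. A sketch of what I mean (not the exact lines; the key suffixes are taken from the error messages above):

# Inside the write loop of convert-lora-to-ggml.py
for k, v in model.items():
    orig_k = k
    # Workaround sketch: step over the resized lm_head / embed_tokens tensors
    # instead of raising "unrecognized tensor name".
    if k.endswith("lm_head.base_layer.weight") or k.endswith("embed_tokens.weight"):
        print(f"skipping {k}")
        continue
    if k.endswith(".default.weight"):
        ...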
I have the same problem with a Mixtral MoE LoRA.
You can get rid of the errors for 'base_model.model.lm_head.lora_A.weight' and 'base_model.model.lm_head.lora_B.weight' by following another closed issue. I am still trying to figure out how to address the error with base_model.model.lm_head.base_layer.weight; I will update once I have addressed it.
Bump, having the same problem now.
I think the crux of this issue is adding special tokens while training the adapter. The vocab size is increased, so the fully connected layer at the end, which had dim (vocab_size), now has dim (vocab_size + len(special_tokens)). That resized layer is then included in the LoRA .safetensors file...
I can bypass this if I force the conversion script to pass over these layers, but I think that affects model performance.
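An alternative to patching the script is to strip the resized layers out of the adapter file before conversion. A sketch, with placeholder paths and a key filter that is my assumption rather than something from this thread:

# Sketch: drop the resized embedding / lm_head tensors from the adapter so
# convert-lora-to-ggml.py only sees plain LoRA tensors.
from safetensors.torch import load_file, save_file

tensors = load_file("my_lora_adapter/adapter_model.safetensors")
kept = {name: t for name, t in tensors.items()
        if "lm_head" not in name and "embed_tokens" not in name}
save_file(kept, "my_lora_adapter/adapter_model.safetensors")

Note that this throws away the retrained embedding rows for the added special tokens, which is presumably the same performance concern as skipping them inside the script.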
This issue was closed because it has been inactive for 14 days since being marked as stale.