llama.cpp
Issue with convert-lora-to-ggml.py
Hi!
I am currently using the Hugging Face SFTTrainer to fine-tune my own model. I save the model to Weights & Biases, then download the adapter weights to my computer and use llama.cpp/convert-lora-to-ggml.py to convert the files into a format I can load into llama.cpp.
This is the trainer setup I am using for training:
trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    peft_config=config,
    dataset_text_field="text",
    max_seq_length=None,  # You can specify the maximum sequence length here
    tokenizer=tokenizer,
    args=training_arguments,
    packing=packing,
)
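For completeness, the step between training the adapter and running the conversion script looks roughly like this. It's only a sketch; the my_lora_adapter directory name is a placeholder, not a detail from this thread.

# Sketch: save the PEFT adapter locally, then point the llama.cpp script at it.
trainer.model.save_pretrained("my_lora_adapter")   # writes adapter_config.json + adapter weights
tokenizer.save_pretrained("my_lora_adapter")

# Then, from a llama.cpp checkout:
#   python convert-lora-to-ggml.py my_lora_adapter
# which should produce my_lora_adapter/ggml-adapter-model.bin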
Below is the error I am getting:
Error: unrecognized tensor name base_model.model.lm_head.base_layer.weight
These are the first three keys when I print out model.keys() in the script:
'base_model.model.lm_head.base_layer.weight', 'base_model.model.lm_head.lora_A.weight', 'base_model.model.lm_head.lora_B.weight',
...
with open(output_path, "wb") as fout:
    fout.truncate()

    write_file_header(fout, params)
    print(model.keys())  # added this line to get the printout
    for k, v in model.items():
        orig_k = k
        if k.endswith(".default.weight"):
...
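If you want to see which tensor names the script will encounter without editing it, you can read them straight from the adapter file. A minimal sketch, assuming the adapter was saved with the default name adapter_model.safetensors and a placeholder directory:

# Sketch: list the adapter's tensor names directly from the safetensors file.
from safetensors import safe_open

with safe_open("my_lora_adapter/adapter_model.safetensors", framework="pt") as f:
    for name in f.keys():
        print(name)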
I'd prefer not to hand out the files I trained, as they contain information I can't share.
@Gunnar-Stunnar I am running into a similar error with the conversion script when trying to convert a LoRA from a StableLM-arch-derived model. I'll update if I can find a fix.
Same with me! But I found a workaround.
I've tried mistralai/Mixtral-8x7B-Instruct-v0.1
and NousResearch/Nous-Hermes-2-Yi-34B
(both threw this warning, if it helps: /home/vic/.local/lib/python3.10/site-packages/peft/utils/save_and_load.py:160: UserWarning: Setting `save_embedding_layers` to `True` as the embedding layer has been resized during finetuning.)
I tried it w/o safetensors:
trainer.model.save_pretrained(finetuned_model_name, safe_serialization=False, from_pt=True)
and it works!
(I still got this error, but it's not a blocker: Error: unrecognized tensor name base_model.model.model.embed_tokens.weight)
I just converted Gemma LoRA adapter weights to ggml. How can one use the LoRA adapter weights to make inferences with llama.cpp?
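In case it's useful, one way to run the converted adapter (not something confirmed in this thread, just a sketch assuming the llama-cpp-python bindings and placeholder paths):

# Sketch: load a base model plus the converted LoRA adapter with llama-cpp-python.
# Both paths are placeholders. The CLI equivalent is roughly:
#   ./main -m models/base-model-f16.gguf --lora my_lora_adapter/ggml-adapter-model.bin -p "..."
from llama_cpp import Llama

llm = Llama(
    model_path="models/base-model-f16.gguf",              # base model the LoRA was trained on
    lora_path="my_lora_adapter/ggml-adapter-model.bin",   # output of convert-lora-to-ggml.py
)
out = llm("### Instruction:\nSay hello.\n\n### Response:\n", max_tokens=64)
print(out["choices"][0]["text"])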
Update: I haven't deployed it yet to see full results, but I found that if I add a line to the script to step over that layer, the output works once I load it in llama.cpp.
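Roughly, the change is to skip the offending tensors in the write loop of convert-lora-to-ggml.py instead of erroring out. A sketch of what I mean (not the exact lines; the key suffixes are taken from the error messages above):

# Inside the write loop of convert-lora-to-ggml.py
for k, v in model.items():
    orig_k = k
    # Workaround sketch: step over the resized lm_head / embed_tokens tensors
    # instead of raising "unrecognized tensor name".
    if k.endswith("lm_head.base_layer.weight") or k.endswith("embed_tokens.weight"):
        print(f"skipping {k}")
        continue
    if k.endswith(".default.weight"):
        ...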
I have the same problem with a Mixtral MoE LoRA.
You can get rid of the errors for 'base_model.model.lm_head.lora_A.weight' and 'base_model.model.lm_head.lora_B.weight' by following another closed issue. I am still trying to figure out how to address the error with base_model.model.lm_head.base_layer.weight; I will update once I have addressed it.
Bump, having the same problem now.
I think the crux of this issue is adding special tokens while training the adapter. The vocab size is increased, so the fully connected layer at the end, which had dim (vocab_size), now has dim (vocab_size + len(special_tokens)). That resized layer is then included in the LoRA .safetensors file...
I can bypass this if I force the conversion script to pass over these layers, but I think that affects model performance.
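An alternative to patching the script is to strip the resized layers out of the adapter file before conversion. A sketch, with placeholder paths and a key filter that is my assumption rather than something from this thread:

# Sketch: drop the resized embedding / lm_head tensors from the adapter so
# convert-lora-to-ggml.py only sees plain LoRA tensors.
from safetensors.torch import load_file, save_file

tensors = load_file("my_lora_adapter/adapter_model.safetensors")
kept = {name: t for name, t in tensors.items()
        if "lm_head" not in name and "embed_tokens" not in name}
save_file(kept, "my_lora_adapter/adapter_model.safetensors")

Note that this throws away the retrained embedding rows for the added special tokens, which is presumably the same performance concern as skipping them inside the script.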
This issue was closed because it has been inactive for 14 days since being marked as stale.