Hugging Face from_pretrained() with merged weights raises KeyError: 'base_model_name_or_path'
Test code from https://pytorch.org/torchtune/stable/tutorials/e2e_flow.html#use-with-hugging-face-from-pretrained:
from transformers import AutoModelForCausalLM, AutoTokenizer
import transformers

print(transformers.__version__)

# TODO: update it to your chosen epoch
trained_model_path = "models/torchtune/llama3_2_3B/lora_single_device/epoch_1"
# trained_model_path = "/home/cine/Documents/tune/models/Llama-3.2-3B-Instruct"

model = AutoModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path=trained_model_path,
)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(trained_model_path, safetensors=True)

# Function to generate text
def generate_text(model, tokenizer, prompt, max_length=50):
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_length=max_length)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

prompt = "tell me a joke"
print("Base model output:", generate_text(model, tokenizer, prompt))

prompt = "Complete the sentence: 'Once upon a time..."
print("Base model output:", generate_text(model, tokenizer, prompt))
Error:
(base) cine@20211029-a04:~/Documents/tune$ /home/cine/miniconda3/envs/tune/bin/python /home/cine/Documents/tune/gen_from_merged_sft.py
Traceback (most recent call last):
  File "/home/cine/Documents/tune/gen_from_merged_sft.py", line 7, in <module>
    model = AutoModelForCausalLM.from_pretrained(
  File "/home/cine/miniconda3/envs/tune/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 514, in from_pretrained
    pretrained_model_name_or_path = adapter_config["base_model_name_or_path"]
KeyError: 'base_model_name_or_path'
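The traceback shows where the error comes from: transformers found an adapter_config.json next to the weights and tried to read the base model from its base_model_name_or_path field, which the exported config apparently lacks. One possible workaround (a sketch of an assumption, not torchtune's documented fix) is to add the missing key yourself, pointing it at the local base model:

import json
from pathlib import Path

# Workaround sketch: add the key that transformers' PEFT integration
# expects to the exported adapter_config.json. Paths are the ones used
# in the snippet above.
ckpt_dir = Path("models/torchtune/llama3_2_3B/lora_single_device/epoch_1")
cfg_path = ckpt_dir / "adapter_config.json"
cfg = json.loads(cfg_path.read_text())
cfg["base_model_name_or_path"] = "/home/cine/Documents/tune/models/Llama-3.2-3B-Instruct"
cfg_path.write_text(json.dumps(cfg, indent=2))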
But I can use PEFT to load the SFT model with:
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# TODO: update it to your chosen epoch
trained_model_path = "models/torchtune/llama3_2_3B/lora_single_device/epoch_1"

# Define the model and adapter paths.
# To avoid this error, we can use the local model.
original_model_name = '/home/cine/Documents/tune/models/Llama-3.2-3B-Instruct'
model = AutoModelForCausalLM.from_pretrained(original_model_name)

# Hugging Face will look for adapter_model.safetensors and adapter_config.json
peft_model = PeftModel.from_pretrained(model, trained_model_path)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(original_model_name)

# Function to generate text
def generate_text(model, tokenizer, prompt, max_length=50):
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_length=max_length)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

prompt = "tell me a joke: '"
print("Base model output:", generate_text(peft_model, tokenizer, prompt))
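If the goal is a checkpoint that from_pretrained can load on its own, PEFT can also fold the adapter into the base weights. A minimal sketch continuing from the snippet above (the output directory name is hypothetical):

# merge_and_unload() bakes the LoRA deltas into the base weights and
# returns a plain transformers model with no adapter files attached.
merged = peft_model.merge_and_unload()
merged.save_pretrained("models/torchtune/llama3_2_3B/merged")  # hypothetical path
tokenizer.save_pretrained("models/torchtune/llama3_2_3B/merged")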
Hugging Face may be prioritizing reading from "adapter_config.json" instead of reading the model config. Maybe when I tested it, I tried it with full finetuning instead of LoRA.
One sanity check is to remove or move the adapter_model.safetensors and adapter_config.json files, to see if it defaults to the full model (see the sketch below). I am on PTO this week, but I can look into it next week.
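That sanity check could look like this, assuming the epoch_1 checkpoint path from the snippets above (the backup directory name is hypothetical):

import shutil
from pathlib import Path

# Move the adapter files aside so from_pretrained cannot find them and
# has to fall back to the merged full-model weights in the same directory.
ckpt_dir = Path("models/torchtune/llama3_2_3B/lora_single_device/epoch_1")
backup_dir = ckpt_dir / "adapter_backup"  # hypothetical backup location
backup_dir.mkdir(exist_ok=True)
for name in ("adapter_model.safetensors", "adapter_config.json"):
    src = ckpt_dir / name
    if src.exists():
        shutil.move(str(src), str(backup_dir / name))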
@chg0901 I'm not able to reproduce the error. For me it seems to work just fine. I might be missing something; can you please help me reproduce it?
OK, but how? I will try my best to assist you if you could specify what I should do.
Will it be possible to share a Colab notebook with all the code to reproduce the error?
Please check this repo: https://github.com/chg0901/hands_on_torchtune
The blog is written in Chinese, but I think you could use a translator to read it.
Have a good day.
I had the same issue, and @felipemello1's answer worked for me perfectly.