
PEFT LoRA: Segmentation fault (core dumped)

Open chestnut111 opened this issue 2 years ago • 3 comments

CHATGLM2 LoRA fine-tuning:

```python
model = AutoModel.from_pretrained(model_path, device_map='cpu', trust_remote_code=True)
model = PeftModel.from_pretrained(model, ckpt_path)  # use your own peft adapter here
model = model.eval()
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

from fastllm_pytools import llm
model = llm.from_hf(model, tokenizer, dtype="float16")
```

This fails with: Segmentation fault (core dumped). Can anyone tell me why?

chestnut111 commented Dec 04 '23

When this project loads an adapter, it looks the weights up by layer name, and those names are hard-coded. After a peft version update, the module names inside a PeftModel changed, so the project can no longer find the parameters; it then errors during warmup and finally crashes with Segmentation fault (core dumped).

My fix: in fastllm/src/models/chatglm.cpp, change these lines to match the names your model actually uses:

```cpp
std::string qkvWeightName = weightPre + std::to_string(i) + weightMiddle + ".query_key_value.weight";
std::string qkvBiasName = weightPre + std::to_string(i) + weightMiddle + ".query_key_value.bias";

std::string loraAWeightName = weightPre + std::to_string(i) + weightMiddle + ".query_key_value.lora_A." + adapterName + ".weight";
std::string loraBWeightName = weightPre + std::to_string(i) + weightMiddle + ".query_key_value.lora_B." + adapterName + ".weight";
```

In my case the qkv names were the problem; changing them to

```cpp
std::string qkvWeightName = weightPre + std::to_string(i) + weightMiddle + ".query_key_value.base_layer.weight";
std::string qkvBiasName = weightPre + std::to_string(i) + weightMiddle + ".query_key_value.base_layer.bias";
```

fixed it.
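A quick way to tell which naming scheme your installed peft version produces is to inspect the state_dict keys of the wrapped model before calling llm.from_hf: newer peft versions store the frozen original weight under a `.base_layer.` sub-module, while older versions keep the plain name. Below is a minimal sketch of that check; the sample key strings are illustrative, and in a real script you would pass `model.state_dict().keys()` instead.

```python
def uses_base_layer_naming(state_dict_keys):
    """Return True if the peft-wrapped model stores original weights
    under '.base_layer.' (newer peft), which means the hard-coded names
    in chatglm.cpp need the '.base_layer' suffix added."""
    return any(".base_layer." in key for key in state_dict_keys)

# Illustrative keys only: older peft keeps the original weight name...
old_style = [
    "transformer.encoder.layers.0.self_attention.query_key_value.weight",
    "transformer.encoder.layers.0.self_attention.query_key_value.lora_A.default.weight",
]
# ...while newer peft wraps the frozen weight as 'base_layer'.
new_style = [
    "transformer.encoder.layers.0.self_attention.query_key_value.base_layer.weight",
    "transformer.encoder.layers.0.self_attention.query_key_value.lora_A.default.weight",
]

print(uses_base_layer_naming(old_style))  # False -> keep the original names
print(uses_base_layer_naming(new_style))  # True  -> add '.base_layer'
```

If the check returns True, apply the `.base_layer` rename described above; otherwise the stock names should resolve.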

zzykira commented Jan 30 '24


I fine-tuned chatglm3 with LoRA and then tried to accelerate it with fastllm; I ran into the same problem.

pingyuan2016 commented Mar 08 '24



I haven't tried chatglm3, but in my own tests the speedup for chatglm2 wasn't great, and it got even worse with the LoRA adapter loaded.

zzykira commented Mar 19 '24