
PEFT LoRA: Segmentation fault (core dumped)

Open chestnut111 opened this issue 2 years ago • 3 comments

CHATGLM2 LoRA fine-tuning:

```python
model = AutoModel.from_pretrained(model_path, device_map='cpu', trust_remote_code=True)
model = PeftModel.from_pretrained(model, ckpt_path)  # use your own peft adapter here
model = model.eval()
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

from fastllm_pytools import llm
model = llm.from_hf(model, tokenizer, dtype="float16")
```

This fails with: Segmentation fault (core dumped). Can anyone tell me why?

chestnut111 commented Dec 04 '23

When this project loads an adapter, it looks the weights up by layer name, and those names are hard-coded. After a peft version update, the module names inside a PeftModel changed, so the project can no longer find the parameters; it then errors during warmup and finally crashes with Segmentation fault (core dumped).

My fix: in fastllm/src/models/chatglm.cpp, change these lines to match the names your model actually uses:

```cpp
std::string qkvWeightName = weightPre + std::to_string(i) + weightMiddle + ".query_key_value.weight";
std::string qkvBiasName = weightPre + std::to_string(i) + weightMiddle + ".query_key_value.bias";

std::string loraAWeightName = weightPre + std::to_string(i) + weightMiddle + ".query_key_value.lora_A." + adapterName + ".weight";
std::string loraBWeightName = weightPre + std::to_string(i) + weightMiddle + ".query_key_value.lora_B." + adapterName + ".weight";
```

In my case the qkv names were the problem; changing them to

```cpp
std::string qkvWeightName = weightPre + std::to_string(i) + weightMiddle + ".query_key_value.base_layer.weight";
std::string qkvBiasName = weightPre + std::to_string(i) + weightMiddle + ".query_key_value.base_layer.bias";
```

fixed it.
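A quick way to tell which naming scheme your installed peft version produces is to inspect the state_dict keys of the wrapped model before calling llm.from_hf: newer peft versions store the frozen original weight under a `.base_layer.` sub-module, while older versions keep the plain name. Below is a minimal sketch of that check; the sample key strings are illustrative, and in a real script you would pass `model.state_dict().keys()` instead.

```python
def uses_base_layer_naming(state_dict_keys):
    """Return True if the peft-wrapped model stores original weights
    under '.base_layer.' (newer peft), which means the hard-coded names
    in chatglm.cpp need the '.base_layer' suffix added."""
    return any(".base_layer." in key for key in state_dict_keys)

# Illustrative keys only: older peft keeps the original weight name...
old_style = [
    "transformer.encoder.layers.0.self_attention.query_key_value.weight",
    "transformer.encoder.layers.0.self_attention.query_key_value.lora_A.default.weight",
]
# ...while newer peft wraps the frozen weight as 'base_layer'.
new_style = [
    "transformer.encoder.layers.0.self_attention.query_key_value.base_layer.weight",
    "transformer.encoder.layers.0.self_attention.query_key_value.lora_A.default.weight",
]

print(uses_base_layer_naming(old_style))  # False -> keep the original names
print(uses_base_layer_naming(new_style))  # True  -> add '.base_layer'
```

If the check returns True, apply the `.base_layer` rename described above; otherwise the stock names should resolve.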

zzykira commented Jan 30 '24


I fine-tuned chatglm3 with LoRA and then tried to accelerate it with fastllm; I ran into the same problem.

pingyuan2016 commented Mar 08 '24



I haven't tried chatglm3, but in my own tests the speedup for chatglm2 wasn't great, and it got even worse with the LoRA adapter loaded.

zzykira commented Mar 19 '24