Merging Qwen1.5

As the title says. Here is the code:
```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--path_to_adapter", help="path to the adapter (fine-tuning output) directory")
parser.add_argument("--merge_path", help="path to save the merged model")
args = parser.parse_args()

# new_model_directory = 'merge_qwen'
# path_to_adapter = "output_qwen"

# Load the base model together with the LoRA adapter
model = AutoPeftModelForCausalLM.from_pretrained(
    args.path_to_adapter,
    device_map="auto",
    trust_remote_code=True,
).eval()

# Merge the adapter weights into the base model
merged_model = model.merge_and_unload()

# max_shard_size and safe_serialization are optional: they control
# checkpoint sharding and saving in the safetensors format, respectively.
merged_model.save_pretrained(args.merge_path, max_shard_size="2048MB", safe_serialization=True)

tokenizer = AutoTokenizer.from_pretrained(
    args.path_to_adapter,  # path to the fine-tuning output directory
    trust_remote_code=True,
)
tokenizer.save_pretrained(args.merge_path)
```
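(Not part of the original script, but a minimal sanity check: after a successful merge, the directory saved to `--merge_path` should load as a plain `transformers` model with no `peft` dependency.)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical follow-up check: the merged checkpoint is a standalone
# model, so it loads without any adapter machinery.
merge_path = "/data/ai/user/merge_test"  # same value as --merge_path below
model = AutoModelForCausalLM.from_pretrained(
    merge_path,
    device_map="auto",
    trust_remote_code=True,
).eval()
tokenizer = AutoTokenizer.from_pretrained(merge_path, trust_remote_code=True)
```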
The script is run as:

```bash
CUDA_VISIBLE_DEVICES=0 python merge.py --path_to_adapter /data/ai/user/Qwen1.5-14B-Chat_sft_checkpoint --merge_path /data/ai/user/merge_test
```

With Qwen1 this works fine. With Qwen1.5 it fails with the following error:

```
NotImplementedError: Cannot copy out of meta tensor; no data!
```
Please check `model.hf_device_map` and see if there are parameters on the meta device. If there are such parameters, you need to force them onto a real device: change `device_map="auto"` to `device_map="cpu"`, or provide a custom `device_map` that maps the parameters to your GPU devices. This happens if there is not enough available GPU memory (for example, because other processes are occupying the GPUs) or if `accelerate` miscalculates the needed VRAM.
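For reference, a minimal sketch of the first suggestion, reusing `args.path_to_adapter` from the script above (merging does not require a GPU, so loading everything on CPU is the simplest workaround, provided there is enough system RAM for the 14B weights):

```python
from peft import AutoPeftModelForCausalLM

# Load the base model plus adapter entirely on CPU so that accelerate
# never parks parameters on the meta device when VRAM runs short.
model = AutoPeftModelForCausalLM.from_pretrained(
    args.path_to_adapter,
    device_map="cpu",
    trust_remote_code=True,
).eval()
merged_model = model.merge_and_unload()
```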
However, this issue should have been addressed in `peft>=0.8.0`. If your version of `peft` is 0.8.0 or later, it is advised to report it to `peft`.
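A quick way to confirm which version is installed (a trivial sketch; `peft` exposes its version string as `__version__`):

```python
import peft

# The meta-tensor copy issue is reported as fixed in peft >= 0.8.0.
print(peft.__version__)
```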
P.S.: What is `UDAU_VISIBLE_DEVICES=0` supposed to do? Is `UDAU` a special kind of device?