LoRA fine-tuning of gte embedding: inference with the merged model differs greatly from the fine-tuned results
After LoRA fine-tuning the gte embedding model, inference with the merged model gives results that differ greatly from those observed during fine-tuning, and is even worse than the original base model.
Training command:
swift sft \
--model 'iic/gte_Qwen2-1.5B-instruct' \
--train_type lora \
--dataset '/workspace/train_df.csv' \
--val_dataset '/workspace/test_df.csv' \
--torch_dtype bfloat16 \
--num_train_epochs 3 \
--per_device_train_batch_size 4 \
--per_device_eval_batch_size 4 \
--gradient_accumulation_steps 16 \
--eval_steps 100 \
--save_steps 100 \
--eval_strategy steps \
--use_chat_template false \
--save_total_limit 2 \
--logging_steps 5 \
--output_dir output \
--warmup_ratio 0.05 \
--learning_rate 5e-6 \
--deepspeed zero3 \
--dataloader_num_workers 4 \
--task_type embedding \
--loss_type cosine_similarity \
--dataloader_drop_last true
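For context, train_df.csv here is a pairwise file with a float similarity label, which is roughly what --loss_type cosine_similarity needs; the column names below are only an illustrative assumption, not a confirmed swift requirement. A minimal sketch:

import pandas as pd

# Hypothetical pairwise layout for --loss_type cosine_similarity:
# two text columns plus a float similarity label in [0, 1].
# The column names "query" / "response" / "label" are assumptions.
train_df = pd.DataFrame(
    {
        "query": ["how to reset my password", "shipping time to Europe"],
        "response": ["steps to change your account password", "estimated delivery for EU orders"],
        "label": [0.9, 0.2],
    }
)
train_df.to_csv("/workspace/train_df.csv", index=False)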
Merge command:
swift export \
--adapters /workspace/output/v1/checkpoint-800 \
--merge_lora true
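A quick sanity check after exporting is to encode the same sentence with the base model and with the merged checkpoint and compare the embeddings; if they are near-identical, the LoRA weights were effectively not applied. This is only a diagnostic sketch, reusing the paths from this issue:

from sentence_transformers import SentenceTransformer

base = SentenceTransformer("iic/gte_Qwen2-1.5B-instruct", trust_remote_code=True)
merged = SentenceTransformer("/workspace/output/v1/checkpoint-800-merged", trust_remote_code=True)

text = ["a sentence that appears in your training data"]
e_base = base.encode(text, normalize_embeddings=True)
e_merged = merged.encode(text, normalize_embeddings=True)

# Cosine similarity between the two embeddings of the same sentence.
# A value of ~1.0 suggests the merge did not change the weights at all.
print((e_base @ e_merged.T)[0, 0])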
After merging, run inference with SentenceTransformer:
from sentence_transformers import SentenceTransformer
# model = SentenceTransformer("iic/gte_Qwen2-1.5B-instruct", trust_remote_code=True)
model = SentenceTransformer("/workspace/output/v1/checkpoint-800-merged", trust_remote_code=True)
# In case you want to reduce the maximum length:
model.max_seq_length = 8192
queries = [...]
documents = [...]
query_embeddings = model.encode(queries, prompt_name="query")
document_embeddings = model.encode(documents)
scores = (query_embeddings @ document_embeddings.T)
Scores computed on the first ten rows of the test set with the base iic/gte_Qwen2-1.5B-instruct model:
Scores computed on the first ten rows of the test set with the model merged after LoRA fine-tuning iic/gte_Qwen2-1.5B-instruct:
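For reference, per-pair scores like these could be computed along the following lines; the test_df.csv column names are assumptions and may differ from the actual file:

import pandas as pd
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("/workspace/output/v1/checkpoint-800-merged", trust_remote_code=True)

# Column names "query" / "response" are assumptions about test_df.csv.
df = pd.read_csv("/workspace/test_df.csv").head(10)
q_emb = model.encode(df["query"].tolist(), prompt_name="query", normalize_embeddings=True)
d_emb = model.encode(df["response"].tolist(), normalize_embeddings=True)

# Score each query against its paired document (diagonal of the full
# similarity matrix), scaled by 100 like the official example.
scores = (q_emb @ d_emb.T).diagonal() * 100
print(scores)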
Hi, I fine-tuned on my own data and the results also got worse. Have you found the cause? I also noticed that the GPU memory usage is extremely high.
Following the swift docs, I loaded the LoRA checkpoint directly in Python and ran inference on the validation set; the eval_loss matched the final value from fine-tuning, so the LoRA checkpoint itself is fine. The problem seems to be in the merge_lora step of swift export. Below is my code for loading the LoRA checkpoint and merging it; hope it helps.
from swift.llm import get_model_tokenizer
from swift.tuners import Swift
from swift.utils import copy_files_by_pattern
model_dir = './base_model'
lora_checkpoint_dir = './outputs/v0-lora-checkpoint' # change this to your checkpoint dir
# Load model
model, tokenizer = get_model_tokenizer(model_dir)
model = Swift.from_pretrained(model, lora_checkpoint_dir)  # attach the LoRA adapter to the loaded base model
# Merge LoRA weights into the base model
model = model.merge_and_unload() # This fuses LoRA into the base weights
# Save model
output_dir = './outputs/v0-lora-checkpoint-merged'
model.save_pretrained(output_dir, safe_serialization=True)
# Copy SentenceTransformers files
copy_files_by_pattern(model_dir, output_dir, '*.py')
copy_files_by_pattern(model_dir, output_dir, '*.json')
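To narrow down where the swift export merge diverges, a weight-level diff between the two merged checkpoints can also help. A rough sketch, assuming both merged directories load with AutoModel; the "-merged-swift" path is hypothetical:

import torch
from transformers import AutoModel

# Hypothetical paths: the swift-export merge vs. the manual merge above.
swift_merged = AutoModel.from_pretrained(
    './outputs/v0-lora-checkpoint-merged-swift', torch_dtype=torch.bfloat16, trust_remote_code=True
)
manual_merged = AutoModel.from_pretrained(
    './outputs/v0-lora-checkpoint-merged', torch_dtype=torch.bfloat16, trust_remote_code=True
)

# Report parameters that differ noticeably between the two merges.
sd_a, sd_b = swift_merged.state_dict(), manual_merged.state_dict()
for name in sd_a:
    if name in sd_b:
        diff = (sd_a[name].float() - sd_b[name].float()).abs().max().item()
        if diff > 1e-3:  # tolerance is arbitrary
            print(f"{name}: max abs diff {diff:.4f}")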
Afterwards, I reloaded my own merged model for testing and got results consistent with inference using the LoRA checkpoint directly:
from sentence_transformers import SentenceTransformer
model_dir = './outputs/v0-lora-checkpoint-merged'
model = SentenceTransformer(model_dir)
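For completeness, the kind of spot check this implies, as a self-contained sketch (the sentences are placeholders and the path is the hypothetical one from above):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer('./outputs/v0-lora-checkpoint-merged')

# Placeholder sentences; in practice, take a pair from the validation set.
emb = model.encode(
    ["example query from the validation set", "example matching document"],
    normalize_embeddings=True,
)
# Cosine similarity of the pair; it should match what the LoRA checkpoint gives for the same pair.
print(float(emb[0] @ emb[1]))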
I can't reproduce the problem you are seeing. I first trained with iic/gte_Qwen2-1.5B-instruct and sentence-transformers/stsb:
{
"name": "embedding_nlp",
"type": "python",
"request": "launch",
"program": "swift/cli/sft.py",
"console": "integratedTerminal",
"justMyCode": false,
"env": {
"CUDA_VISIBLE_DEVICES": "2",
"PYTHONPATH": ".",
},
"args": [
"--model", "iic/gte_Qwen2-1.5B-instruct",
"--task_type", "embedding",
"--train_type", "lora",
"--dataset", "sentence-transformers/stsb",
"--split_dataset_ratio", "0.05",
"--eval_strategy", "steps",
"--output_dir", "output",
"--eval_steps", "1000",
"--save_steps", "1000",
"--per_device_train_batch_size", "2",
"--per_device_eval_batch_size", "2",
"--gradient_accumulation_steps", "16",
"--learning_rate", "6e-6",
"--loss_type", "cosine_similarity",
"--label_names", "labels",
"--dataloader_drop_last", "true",
]
},
Then I merged with swift export and with the code above, and ran the official example:
https://www.modelscope.cn/models/iic/gte_Qwen2-1.5B-instruct
The resulting values are identical:
# swift export
[[54.58847427368164, 17.177675247192383], [15.789773941040039, 51.380191802978516]]
# merged with the code above
[[54.58847427368164, 17.177675247192383], [15.789773941040039, 51.380191802978516]]
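In other words, the two merge paths agree on the official example. A trivial programmatic check of the same comparison, with the numbers copied from above:

import numpy as np

# Score matrices from the swift-export merge and from the manual merge code.
swift_export_scores = np.array([[54.58847427368164, 17.177675247192383],
                                [15.789773941040039, 51.380191802978516]])
manual_merge_scores = np.array([[54.58847427368164, 17.177675247192383],
                                [15.789773941040039, 51.380191802978516]])
print(np.allclose(swift_export_scores, manual_merge_scores))  # True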