
Fine-tuning Qwen3-Embedding-0.6B with LoRA in the swift framework: after the LoRA merge, the results are the same as without fine-tuning

Open wongs19 opened this issue 4 months ago • 12 comments

1. The fine-tuning script is as follows:

torchrun --standalone --nnodes=1 --nproc-per-node=4 \
$(which swift) sft \
--model pretrained_model/Qwen3-Embedding-0.6B \
--task_type embedding \
--model_type qwen3_emb \
--train_type lora \
--dataset /contrast_dataset/mini_train_110.jsonl \
--val_dataset contrast_dataset/mini_val_110.jsonl \
--save_strategy epoch \
--eval_strategy epoch \
--logging_steps 25 \
--output_dir output \
--eval_steps 25 \
--save_steps 25 \
--num_train_epochs 10 \
--save_total_limit 3 \
--per_device_train_batch_size 64 \
--per_device_eval_batch_size 64 \
--gradient_accumulation_steps 4 \
--learning_rate 2e-5 \
--loss_type contrastive \
--label_names labels \
--dataloader_drop_last true \
--dataloader_num_workers 8 \
--max_length 1024 \
--truncation_strategy left \
--ddp_backend nccl

2. Also, when fine-tuning with swift using the dataset format {query, response, rejected_response} and loss_type=infonce, evaluation fails with an "eval_loss" error.
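
For reference, a minimal sketch of that dataset format (field names match the samples posted later in this thread; the texts themselves are placeholders):

{"query": "example query", "response": "relevant passage", "rejected_response": ["irrelevant passage 1", "irrelevant passage 2"]}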

wongs19 avatar Aug 19 '25 05:08 wongs19

Could you share the exact error behind "eval fails with an 'eval_loss' error"? Also, if the merged result is the same as the un-fine-tuned model, does loading the un-merged adapter make a difference?

tastelikefeet avatar Aug 20 '25 08:08 tastelikefeet

Could you share the exact error behind "eval fails with an 'eval_loss' error"? Also, if the merged result is the same as the un-fine-tuned model, does loading the un-merged adapter make a difference?

1. The error message is as follows:

[rank2]: Traceback (most recent call last):
[rank2]:   File "/opt/conda/envs/qwen3-py310_st/lib/python3.10/site-packages/swift/cli/sft.py", line 10, in <module>
[rank2]:     sft_main()
[rank2]:   File "/opt/conda/envs/qwen3-py310_st/lib/python3.10/site-packages/swift/llm/train/sft.py", line 321, in sft_main
[rank2]:     return SwiftSft(args).main()
[rank2]:   File "/opt/conda/envs/qwen3-py310_st/lib/python3.10/site-packages/swift/llm/base.py", line 49, in main
[rank2]:     result = self.run()
[rank2]:   File "/opt/conda/envs/qwen3-py310_st/lib/python3.10/site-packages/swift/llm/train/sft.py", line 177, in run
[rank2]:     return self.train(trainer)
[rank2]:   File "/opt/conda/envs/qwen3-py310_st/lib/python3.10/site-packages/swift/llm/train/sft.py", line 225, in train
[rank2]:     trainer.train(trainer.args.resume_from_checkpoint)
[rank2]:   File "/opt/conda/envs/qwen3-py310_st/lib/python3.10/site-packages/swift/trainers/trainers.py", line 57, in train
[rank2]:     return super().train(*args, **kwargs)
[rank2]:   File "/opt/conda/envs/qwen3-py310_st/lib/python3.10/site-packages/swift/trainers/mixin.py", line 676, in train
[rank2]:     res = super().train(*args, **kwargs)
[rank2]:   File "/opt/conda/envs/qwen3-py310_st/lib/python3.10/site-packages/transformers/trainer.py", line 2238, in train
[rank2]:     return inner_training_loop(
[rank2]:   File "/opt/conda/envs/qwen3-py310_st/lib/python3.10/site-packages/transformers/trainer.py", line 2664, in _inner_training_loop
[rank2]:     self._maybe_log_save_evaluate(
[rank2]:   File "/opt/conda/envs/qwen3-py310_st/lib/python3.10/site-packages/swift/trainers/mixin.py", line 730, in _maybe_log_save_evaluate
[rank2]:     super()._maybe_log_save_evaluate(tr_loss, *args, **kwargs)
[rank2]:   File "/opt/conda/envs/qwen3-py310_st/lib/python3.10/site-packages/transformers/trainer.py", line 3138, in _maybe_log_save_evaluate
[rank2]:     is_new_best_metric = self._determine_best_metric(metrics=metrics, trial=trial)
[rank2]:   File "/opt/conda/envs/qwen3-py310_st/lib/python3.10/site-packages/transformers/trainer.py", line 3208, in _determine_best_metric
[rank2]:     raise KeyError(
[rank2]: KeyError: "The metric_for_best_model training argument is set to 'eval_loss', which is not found in the evaluation metrics. The available evaluation metrics are: ['eval_runtime', 'eval_samples_per_second', 'eval_steps_per_second', 'epoch', 'global_step/max_steps', 'percentage', 'elapsed_time', 'remaining_time', 'memory(GiB)', 'train_speed(iter/s)']. Consider changing the metric_for_best_model via the TrainingArguments."
[rank1]: Traceback (most recent call last):
[rank1]:   File "/opt/conda/envs/qwen3-py310_st/lib/python3.10/site-packages/transformers/trainer.py", line 3206, in _determine_best_metric
[rank1]:     metric_value = metrics[metric_to_check]
[rank1]: KeyError: 'eval_loss'
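
The exception itself points at a workaround: the trainer is tracking metric_for_best_model='eval_loss', but this eval run reports no eval_loss key. As a sketch, assuming swift sft passes standard transformers TrainingArguments flags through (not verified against this exact version), best-model tracking can be pointed at a metric the eval loop actually reports:

# Hypothetical workaround; flag names follow transformers.TrainingArguments.
# 'margin' is only an example key, taken from the infonce eval logs later in
# this thread ('eval_margin'); substitute a key your own eval run emits.
swift sft \
    --model pretrained_model/Qwen3-Embedding-0.6B \
    --task_type embedding \
    --model_type qwen3_emb \
    --train_type lora \
    --dataset /contrast_dataset/mini_train_110.jsonl \
    --loss_type contrastive \
    --metric_for_best_model margin \
    --greater_is_better true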

2. I load the adapter directly in transformers via PeftModel.from_pretrained(base_model, lora_path); loaded this way, the fine-tuning has no effect. The LoRA weights have to be merged via swift first; loading the merged model in transformers then does show a difference.
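
For comparison, a minimal sketch of the usual PEFT-side merge (merge_and_unload is standard peft API; the adapter path here is a hypothetical placeholder):

from transformers import AutoModel
from peft import PeftModel

lora_path = "output/checkpoint-xxx"  # hypothetical adapter directory

# Load the base model and attach the LoRA adapter on top of it.
base = AutoModel.from_pretrained("pretrained_model/Qwen3-Embedding-0.6B")
model = PeftModel.from_pretrained(base, lora_path)

# Fold the LoRA deltas into the base weights and drop the adapter wrappers,
# then save a plain checkpoint that transformers can load without peft.
model = model.merge_and_unload()
model.save_pretrained("merged_model")

If embeddings from "merged_model" match the base model exactly, the adapter deltas are effectively zero, which points back at the training run rather than the merge step.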

wongs19 avatar Aug 20 '25 08:08 wongs19

The first problem is probably that the eval dataset is too small. For the second, does it work if you use our inference method directly: https://github.com/modelscope/ms-swift/blob/main/examples/deploy/embedding/client.py

tastelikefeet avatar Aug 20 '25 08:08 tastelikefeet

The first problem is probably that the eval dataset is too small. For the second, does it work if you use our inference method directly: https://github.com/modelscope/ms-swift/blob/main/examples/deploy/embedding/client.py

1. I tried running on a larger eval set and still hit this problem; I'm not sure whether you can reproduce it. Do the rejected_response lists need to be the same length? 2. I'll try your inference method later. Thanks.

wongs19 avatar Aug 20 '25 08:08 wongs19

@wongs19 @tastelikefeet

Looking at your logs and replies, you are actually hitting two classic classes of problems at once (we list them separately in our Problem Map; they correspond to No.14 Bootstrap Ordering and No.6 Config/Checkpoint Mismatch):

Symptom breakdown

  1. The merged model behaves like the un-fine-tuned one. This is usually a mismatch between the loading path and the eval/export path, or the LoRA was never actually merged into the base, so inference still runs the base model.
  2. The eval metric configuration is misaligned. Some examples in the repo hard-code loss as the metric key, but this task's metrics contain no loss; it needs to be changed to a key that actually exists (e.g. accuracy, recall, mean_cosine, spearman).

Shortest checklist (one pass)

  • First, run a minimal verifiable comparison in the same environment to confirm whether the merge took effect:
# 1) Load base and merged directly, compute cosine similarity on the same samples
from sentence_transformers import SentenceTransformer, util

base   = SentenceTransformer("your BASE path or Hugging Face repo")
merged = SentenceTransformer("your merged/exported path")   # make sure this is not the LoRA adapter dir

texts = ["hello world", "Beijing is the capital of China", "embedding check"]
emb_base   = base.encode(texts, normalize_embeddings=True)
emb_merged = merged.encode(texts, normalize_embeddings=True)

print("mean cosine(base vs merged):",
      float(util.cos_sim(emb_base, emb_merged).diagonal().mean()))
# After a real fine-tune this value should not be ≈1.000; if it is ≈1, you are
# still effectively running the base (the merge did not take effect, or the wrong dir was loaded)
  • Merge the LoRA correctly (see the sketch after this list)

    • Make sure you point at the output path of the merged weights (not the adapter directory).
    • You need model.merge_and_unload() or the corresponding export script to fold the LoRA in and export it; the exported directory must match the model_path used at inference time.
    • After pushing to Hugging Face or copying locally, do not call load_adapter again on the inference side.
  • Fix the eval metric

    • Change loss to the name of a metric you actually compute; or simply disable eval and use the vector comparison above to first prove the merge took effect.
  • Align paths and versions

    • Check whether save_dir / output_dir / eval_path / dataset_cache are scattered across multiple branches or containers.
    • Get it running on the same GPU and the same CUDA/Torch version first, to avoid mixed cache/weight reads across containers.
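
On the merge step specifically, a sketch of the CLI route (the swift export flags below follow recent ms-swift documentation, but double-check them against your installed version; the checkpoint path is a placeholder):

# Merge the LoRA adapter into the base model via ms-swift's export command
swift export \
    --adapters output/vX-XXXX/checkpoint-XXX \
    --merge_lora true

The merged weights land in a sibling directory (typically suffixed -merged); that directory, not the adapter directory, is what inference should load.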

Common pitfalls

  • Saving only the LoRA adapter but treating its directory as the final model path at inference time.
  • After a restart or container switch, model_path points at the old base; the merged ckpt never gets read (No.14).
  • The eval script hard-codes the metrics key (No.6), so you conclude "nothing changed" when eval was never running the right metric.

If needed, I can paste our full 16-item checklist with step-by-step fix examples, or walk through your four configs (train/eval/merge/inference) to pin down whether the merge failed or the wrong path is being read. Reference: WFGY ProblemMap (covers the hands-on steps for No.14 / No.6) https://github.com/onestardao/WFGY/blob/main/ProblemMap/README.md

If you want the full examples, just say the word and I'll add a minimal reproducible script plus the matching directory layout.

onestardao avatar Aug 23 '25 04:08 onestardao

The first problem is probably that the eval dataset is too small. For the second, does it work if you use our inference method directly: https://github.com/modelscope/ms-swift/blob/main/examples/deploy/embedding/client.py

1. I tried running on a larger eval set and still hit this problem; I'm not sure whether you can reproduce it. Do the rejected_response lists need to be the same length? 2. I'll try your inference method later. Thanks.

  1. rejected_response does not need aligned lengths; there was a bug around this that has since been fixed. Use the latest main branch or the latest ms-swift package.
  2. If you still get an error, paste the command again and I'll reproduce it directly.
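
For example (standard pip invocations; the git URL is the project's public repository):

# Latest released package:
pip install -U ms-swift
# Or the current main branch:
pip install "git+https://github.com/modelscope/ms-swift.git"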

tastelikefeet avatar Aug 23 '25 04:08 tastelikefeet

The first problem is probably that the eval dataset is too small. For the second, does it work if you use our inference method directly: https://github.com/modelscope/ms-swift/blob/main/examples/deploy/embedding/client.py

1. I tried running on a larger eval set and still hit this problem; I'm not sure whether you can reproduce it. Do the rejected_response lists need to be the same length? 2. I'll try your inference method later. Thanks.

  1. rejected_response does not need aligned lengths; there was a bug around this that has since been fixed. Use the latest main branch or the latest ms-swift package.
  2. If you still get an error, paste the command again and I'll reproduce it directly.

Versions: transformers 4.55.0, ms-swift 3.8.0.dev0

export INFONCE_TEMPERATURE=0.01
export INFONCE_USE_BATCH=False
export INFONCE_MASK_FAKE_NEGATIVE=False

INFONCE_MASK_FAKE_NEGATIVE=true \

CUDA_VISIBLE_DEVICES=0,1,2,3 \
NPROC_PER_NODE=4 \
swift sft \
--model pretrained_model/Qwen3-Embedding-0.6B \
--task_type embedding \
--model_type qwen3_emb \
--train_type full \
--dataset dataset/mini_train.jsonl \
--val_dataset dataset/mini_val.jsonl \
--eval_strategy steps \
--lora_rank '32' \
--output_dir infonce_output \
--eval_steps 2 \
--num_train_epochs 1 \
--save_steps 70 \
--per_device_train_batch_size 2 \
--per_device_eval_batch_size 2 \
--gradient_accumulation_steps 2 \
--learning_rate 6e-6 \
--loss_type infonce \
--label_names labels \
--dataloader_drop_last true \
--deepspeed zero3

This is the command I ran, and it still reports the eval_loss problem.

wongs19 avatar Aug 23 '25 04:08 wongs19

I still don't get the error here:

{'loss': 1.49420977, 'grad_norm': 532.29472268, 'learning_rate': 5.6e-06, 'epoch': 0.17, 'global_step/max_steps': '1/6', 'percentage': '16.67%', 'elapsed_time': '12s', 'remaining_time': '1m 3s', 'memory(GiB)': 3.89, 'train_speed(iter/s)': 0.078176}
Train:  33%|█████████████████████████████████████                                                                          | 2/6 [00:14<00:23,  5.99s/it]
{'eval_loss': 0.00115679, 'eval_margin': 0.07847409, 'eval_mean_neg': 0.66495824, 'eval_mean_pos': 0.77714515, 'eval_runtime': 4.8696, 'eval_samples_per_second': 20.535, 'eval_steps_per_second': 2.67, 'epoch': 0.33, 'global_step/max_steps': '2/6', 'percentage': '33.33%', 'elapsed_time': '18s', 'remaining_time': '37s', 'memory(GiB)': 3.89, 'train_speed(iter/s)': 0.105711}
Val: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:01<00:00,  6.17it/s]
Train:  67%|██████████████████████████████████████████████████████████████████████████                                     | 4/6 [00:21<00:08,  4.13s/it]
{'eval_loss': 2.12e-06, 'eval_margin': 0.16115864, 'eval_mean_neg': 0.58953881, 'eval_mean_pos': 0.79205102, 'eval_runtime': 2.1415, 'eval_samples_per_second': 46.696, 'eval_steps_per_second': 6.07, 'epoch': 0.67, 'global_step/max_steps': '4/6', 'percentage': '66.67%', 'elapsed_time': '23s', 'remaining_time': '11s', 'memory(GiB)': 3.89, 'train_speed(iter/s)': 0.170367}
Val: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:01<00:00,  6.41it/s]
{'loss': 0.08637756, 'grad_norm': 0.00295253, 'learning_rate': 4e-07, 'epoch': 0.83, 'global_step/max_steps': '5/6', 'percentage': '83.33%', 'elapsed_time': '24s', 'remaining_time': '4s', 'memory(GiB)': 3.89, 'train_speed(iter/s)': 0.202599}
Train: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:25<00:00,  2.97s/it]
{'eval_loss': 2.12e-06, 'eval_margin': 0.16314349, 'eval_mean_neg': 0.58731782, 'eval_mean_pos': 0.79442042, 'eval_runtime': 2.1584, 'eval_samples_per_second': 46.331, 'eval_steps_per_second': 6.023, 'epoch': 1.0, 'global_step/max_steps': '6/6', 'percentage': '100.00%', 'elapsed_time': '28s', 'remaining_time': '0s', 'memory(GiB)': 3.89, 'train_speed(iter/s)': 0.213447}
Val: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:01<00:00,  6.32it/s]
[INFO:swift] Saving model checkpoint to /mnt/nas3/yzhao/tastelikefeet/swift/infonce_output/v0-20250823-163051/checkpoint-6
{'train_runtime': 46.9096, 'train_samples_per_second': 2.132, 'train_steps_per_second': 0.128, 'train_loss': 0.30662052, 'epoch': 1.0, 'global_step/max_steps': '6/6', 'percentage': '100.00%', 'elapsed_time': '46s', 'remaining_time': '0s', 'memory(GiB)': 3.89, 'train_speed(iter/s)': 0.127884}
Train: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:46<00:00,  7.82s/it]
[INFO:swift] last_model_checkpoint: /mnt/nas3/yzhao/tastelikefeet/swift/infonce_output/v0-20250823-163051/checkpoint-6
[INFO:swift] best_model_checkpoint: None
[INFO:swift] images_dir: /mnt/nas3/yzhao/tastelikefeet/swift/infonce_output/v0-20250823-163051/images

Command:

export INFONCE_TEMPERATURE=0.01
export INFONCE_USE_BATCH=False
export INFONCE_MASK_FAKE_NEGATIVE=False

CUDA_VISIBLE_DEVICES=0,1,2,3 \
NPROC_PER_NODE=4 \
swift sft \
--model Qwen/Qwen3-Embedding-0.6B \
--task_type embedding \
--model_type qwen3_emb \
--train_type full \
--dataset test4.jsonl#100 \
--val_dataset test4.jsonl#100 \
--eval_strategy steps \
--lora_rank '32' \
--output_dir infonce_output \
--eval_steps 2 \
--num_train_epochs 1 \
--save_steps 70 \
--per_device_train_batch_size 2 \
--per_device_eval_batch_size 2 \
--gradient_accumulation_steps 2 \
--learning_rate 6e-6 \
--loss_type infonce \
--label_names labels \
--dataloader_drop_last true \
--deepspeed zero3

test4.jsonl:

{"query": "FFF", "response": "GGG", "rejected_response": ["HHH"]}
{"query": "AAA", "response": "BBB", "rejected_response": ["DDD", "EEE"]}
{"query": "FFF", "response": "GGG", "rejected_response": ["HHH", "III"]}
{"query": "AAA", "response": "BBB", "rejected_response": ["DDD"]}
{"query": "FFF", "response": "GGG", "rejected_response": ["HHH", "III"]}
{"query": "AAA", "response": "BBB", "rejected_response": ["DDD", "EEE"]}
{"query": "FFF", "response": "GGG", "rejected_response": ["HHH", "III"]}
{"query": "AAA", "response": "BBB", "rejected_response": ["DDD"]}
{"query": "FFF", "response": "GGG", "rejected_response": ["HHH", "III"]}
{"query": "AAA", "response": "BBB", "rejected_response": ["DDD", "EEE"]}
{"query": "FFF", "response": "GGG", "rejected_response": ["HHH", "III"]}
{"query": "AAA", "response": "BBB", "rejected_response": ["DDD", "EEE"]}
{"query": "FFF", "response": "GGG", "rejected_response": ["HHH", "III"]}
{"query": "AAA", "response": "BBB", "rejected_response": ["DDD", "EEE"]}
{"query": "AAA", "response": "BBB", "rejected_response": ["DDD"]}
{"query": "AAA", "response": "BBB", "rejected_response": ["DDD", "EEE"]}
{"query": "AAA", "response": "BBB", "rejected_response": ["DDD"]}
{"query": "AAA", "response": "BBB", "rejected_response": ["DDD", "EEE"]}
{"query": "FFF", "response": "GGG", "rejected_response": ["HHH", "III"]}
{"query": "AAA", "response": "BBB", "rejected_response": ["DDD"]}
{"query": "FFF", "response": "GGG", "rejected_response": ["HHH", "III"]}
{"query": "AAA", "response": "BBB", "rejected_response": ["DDD", "EEE"]}
{"query": "AAA", "response": "BBB", "rejected_response": ["DDD"]}
{"query": "AAA", "response": "BBB", "rejected_response": ["DDD", "EEE"]}
{"query": "FFF", "response": "GGG", "rejected_response": ["HHH", "III"]}
{"query": "AAA", "response": "BBB", "rejected_response": ["DDD"]}

The format should look like this.

tastelikefeet avatar Aug 23 '25 08:08 tastelikefeet


The data format is the same. Could you share your environment (ms-swift and transformers versions)? This is probably caused by an environment difference.

wongs19 avatar Aug 23 '25 08:08 wongs19

ms-swift is the main branch; transformers is 4.54.*

tastelikefeet avatar Aug 23 '25 08:08 tastelikefeet

ms-swift is the main branch; transformers is 4.54.*

Thanks, I'll check my environment and try again.
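
For instance (a sketch using standard pip commands, combined with installing ms-swift from main as shown earlier):

# Check what is currently installed:
pip show ms-swift transformers
# Pin transformers to the 4.54 series the maintainer tested with:
pip install "transformers==4.54.*"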

wongs19 avatar Aug 23 '25 08:08 wongs19

max_length: limits the maximum number of tokens a single sample may have after tokenizer.encode; samples that exceed it are handled according to the truncation_strategy parameter (to avoid OOM during training). Defaults to None, i.e. the maximum token length the model supports (max_model_len).

For PPO, GRPO, and inference, max_length means max_prompt_length.

truncation_strategy: what to do when a sample's tokens exceed max_length; supports delete, left, and right, meaning drop the sample, truncate from the left, and truncate from the right. The default is 'delete'.

It may be that over-length samples are being deleted, leaving too few valid eval samples.
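
If that is the cause, a sketch of the relevant flags (argument names as documented above; the values are illustrative and reuse the earlier commands in this thread):

swift sft \
    --model Qwen/Qwen3-Embedding-0.6B \
    --task_type embedding \
    --model_type qwen3_emb \
    --dataset test4.jsonl \
    --val_dataset test4.jsonl \
    --loss_type infonce \
    --max_length 1024 \
    --truncation_strategy left

Truncating from the left (or right) keeps over-length samples in the eval set instead of silently dropping them, at the cost of cutting their text.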

liwei6677 avatar Nov 13 '25 12:11 liwei6677