GRPO training hangs and keeps showing the problem below.
The training script is as follows:
NPROC_PER_NODE=3 \
CUDA_VISIBLE_DEVICES=4,5,6,7 \
swift rlhf \
--rlhf_type grpo \
--model /home/data3/zys/lzr/model/model_output/cpt/v40-20250226-082956/checkpoint-6378 \
--adapters /home/data3/zys/lzr/model/model_output/sft/v119-20250327-203143/checkpoint-6740 \
--external_plugins /home/zhangyusi/prj/swift/examples/train/grpo/plugin/plugin_counting.py \
--reward_funcs acc format \
--use_vllm true \
--vllm_device auto \
--vllm_gpu_memory_utilization 0.7 \
--vllm_max_model_len 4096 \
--train_type lora \
--torch_dtype bfloat16 \
--dataset '/home/data3/zys/lzr/SFT/GRPO/data_grpo/grpo_counting.json' \
--max_completion_length 2048 \
--num_train_epochs 1 \
--per_device_train_batch_size 1 \
--per_device_eval_batch_size 1 \
--learning_rate 1e-6 \
--gradient_accumulation_steps 2 \
--eval_steps 100 \
--save_steps 100 \
--save_total_limit 2 \
--logging_steps 5 \
--max_length 2048 \
--output_dir /home/data3/zys/lzr/model/model_output/grpo \
--warmup_ratio 0.05 \
--dataloader_num_workers 4 \
--dataset_num_proc 4 \
--num_generations 3 \
--temperature 0.9 \
--deepspeed zero3 \
--beta 0.001 \
--system '/home/zhangyusi/prj/swift/examples/train/grpo/prompt.txt' \
--log_completions True
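A quick sanity check on the launch configuration can help when debugging a hang. The sketch below only does the arithmetic on the two environment variables from the script: how many visible GPUs are claimed by training processes and how many are left over. The helper name `gpu_split` is hypothetical, and the idea that the leftover GPU is the one vLLM uses with `--vllm_device auto` is an assumption, not something confirmed in this thread.

```python
def gpu_split(cuda_visible_devices: str, nproc_per_node: int) -> tuple[int, int]:
    """Return (training_gpus, leftover_gpus) for a launch configuration.

    Hypothetical helper: it only counts devices, it does not query the driver.
    """
    visible = [d.strip() for d in cuda_visible_devices.split(",") if d.strip()]
    if nproc_per_node > len(visible):
        raise ValueError(
            f"NPROC_PER_NODE={nproc_per_node} exceeds the "
            f"{len(visible)} visible device(s)"
        )
    return nproc_per_node, len(visible) - nproc_per_node


# Values taken from the script above.
train_gpus, leftover_gpus = gpu_split("4,5,6,7", 3)
print(f"training processes: {train_gpus}, GPUs left over: {leftover_gpus}")
```

With the values from the script, three GPUs go to training and one is left over. If the leftover count were zero, training and generation would have to share devices, which would be a different setup from the one shown here.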
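Any pointers would be very much appreciated. I don't know which parameter is set incorrectly.
It stays on this screen for a long time and then fails with a timeout error.
Help needed.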
Async mode can sometimes hang - please try SWIFT 3.2.2 or colocate mode instead.
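I hit the same thing when trying some of the RLHF algorithms: training hangs right at the start and, after a long wait, fails with this error. I had previously run every RLHF algorithm without ever seeing it. After I switched to a different dataset it suddenly started hanging, and switching back to the original dataset did not help. Some algorithms consistently run fine and others consistently hang, with no obvious pattern.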
Which vLLM version are you using? I also got stuck at this point with 0.7.2; after updating to 0.7.3 it worked.
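To compare an environment against the version mentioned above, a small check like the following may be useful. It is a minimal sketch using only the standard library; `parse_release`, `is_at_least`, and `check_package` are hypothetical helper names, and the 0.7.3 threshold is simply the version reported to work in this thread, not an official requirement.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(v: str) -> tuple[int, ...]:
    """Extract the leading numeric release, e.g. '0.7.3.post1' -> (0, 7, 3)."""
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component
        parts.append(int(digits))
    return tuple(parts)


def is_at_least(installed: str, minimum: str) -> bool:
    """True if the installed release is not older than the minimum."""
    return parse_release(installed) >= parse_release(minimum)


def check_package(package: str = "vllm", minimum: str = "0.7.3") -> str:
    """Describe whether the installed package meets the minimum version."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return f"{package} is not installed"
    status = "ok" if is_at_least(installed, minimum) else "older than suggested"
    return f"{package} {installed}: {status} (suggested >= {minimum})"


print(check_package())
```

A passing check does not rule out the hang; it only excludes the one version combination reported as problematic here.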
@1245244103 It runs now, but it prints the messages below. Do they matter?
equations is deprecated, as it handled by the parser now
[the line above repeats 15 times]
Train: 1%|▋ | 4/412 [03:14<5:42:54, 50.43s/it]
INFO 04-27 12:04:57 prefix_caching_block.py:479] Successfully reset prefix cache
INFO 04-27 12:04:57 prefix_caching_block.py:479] Successfully reset prefix cache
equations is deprecated, as it handled by the parser now
[the line above repeats 9 times]
Error during comparison
Traceback (most recent call last):
  File "/root/miniconda3/envs/ansen_swift/lib/python3.10/site-packages/math_verify/grader.py", line 809, in compare_single_extraction_wrapper
    return compare_single_extraction(g, t)
  File "/root/miniconda3/envs/ansen_swift/lib/python3.10/site-packages/math_verify/utils.py", line 51, in wrapper
    return func(*args, **kwargs)
  File "/root/miniconda3/envs/ansen_swift/lib/python3.10/site-packages/math_verify/grader.py", line 789, in compare_single_extraction
    return sympy_expr_eq(
  File "/root/miniconda3/envs/ansen_swift/lib/python3.10/site-packages/math_verify/grader.py", line 670, in sympy_expr_eq
    return sympy_compare_sets(gold, pred, float_rounding, numeric_precision)
  File "/root/miniconda3/envs/ansen_swift/lib/python3.10/site-packages/math_verify/grader.py", line 413, in sympy_compare_sets
    return sympy_deep_compare_set_and_tuple(
  File "/root/miniconda3/envs/ansen_swift/lib/python3.10/site-packages/math_verify/grader.py", line 242, in sympy_deep_compare_set_and_tuple
    return all(
  File "/root/miniconda3/envs/ansen_swift/lib/python3.10/site-packages/math_verify/grader.py", line 243, in
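The traceback above is raised while comparing a gold answer with a predicted one. The thread does not say whether this error stops training. For a custom reward function, one defensive pattern is to treat a failed comparison as a zero reward so that a single malformed answer cannot abort a step. The sketch below shows that pattern only; `safe_score` and the `compare` callable are hypothetical stand-ins and are not part of math_verify or swift.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def safe_score(
    compare: Callable[[str, str], bool],
    gold: str,
    pred: str,
    default: float = 0.0,
) -> float:
    """Return 1.0 if compare(gold, pred) is truthy, 0.0 if not.

    Any exception raised by the comparison is logged and mapped to `default`,
    so one unparsable answer does not interrupt the training loop.
    """
    try:
        return 1.0 if compare(gold, pred) else 0.0
    except Exception:
        logger.warning("comparison failed for %r vs %r", gold, pred, exc_info=True)
        return default


# Hypothetical comparison functions, used only to exercise the wrapper.
def exact_match(gold: str, pred: str) -> bool:
    return gold.strip() == pred.strip()


def always_fails(gold: str, pred: str) -> bool:
    raise ValueError("cannot parse expression")


print(safe_score(exact_match, "3", " 3 "))
print(safe_score(always_fails, "3", "x^"))
```

Mapping failures to zero keeps the reward signal conservative: an answer that cannot be checked is scored the same as a wrong one.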
Feel free to reopen if you have any issues.
How was this problem finally resolved? I am hitting it now as well, with vllm 0.8.3 and ms_swift 3.7.0.