miyeeee
### Script:

```shell
torchrun --master_addr=${MASTER_ADDR} --master_port ${MASTER_PORT} \
    --nproc_per_node=${NPROC_PER_NODE} --nnodes=${NNODES} --node_rank=${NODE_RANK} \
    ${SCRIPT_DIR}/swift/cli/rlhf.py \
    --rlhf_type grpo \
    --check_model false \
    --model /cache/model \
    --reward_funcs format \
    --use_vllm false \
    --vllm_device auto \
    ...
```
Met the same problem, have you solved it yet?
> 32B GRPO full training script https://github.com/modelscope/ms-swift/blob/main/examples/train/grpo/multi_node/Qwen2_5_32B_full.sh What if I don't have 8\*80G GPUs (1 node, 8 GPUs per node), but instead have 32\*32G NPUs (4 nodes, 8 NPU per...
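For the multi-node case, the standard torchrun pattern is to run the same launch command on every node, changing only `NODE_RANK`. A minimal sketch, assuming 4 nodes with 8 devices each and that node 0 acts as the rendezvous master (the actual training flags, memory tuning for 32G devices, and NPU-specific setup are not covered here):

```shell
# Run this on EVERY node; only NODE_RANK differs per node (0, 1, 2, 3).
export MASTER_ADDR=<node0-ip>      # IP of the rank-0 node (assumption: reachable from all nodes)
export MASTER_PORT=29500           # any free port, identical on all nodes
export NNODES=4                    # total number of nodes
export NPROC_PER_NODE=8            # devices per node
export NODE_RANK=0                 # set to 0..3 depending on which node this is

torchrun --master_addr=${MASTER_ADDR} --master_port ${MASTER_PORT} \
    --nproc_per_node=${NPROC_PER_NODE} --nnodes=${NNODES} --node_rank=${NODE_RANK} \
    ${SCRIPT_DIR}/swift/cli/rlhf.py \
    --rlhf_type grpo \
    ...   # remaining flags as in the single-node script
```

With smaller 32G devices, the per-device batch size and sequence length from the 80G script will likely need to be reduced, or offloading/ZeRO sharding increased, but that depends on the specific model and swift configuration.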