Yunqi Yan
Same here. I tested on a 4x RTX 4090 setup (~96GB VRAM in total), and it still ran into OOM.
> I was able to run the code successfully on a machine with 4x RTX 3090 (96GB of VRAM in total) by setting both "train_batch_size" and "validation_batch_size" to 1 in "train.py". (As suggested,...
> pip install py-spy
> py-spy dump --pid

In my case, the py-spy result is:
```
Process 3250260: /home/user/miniconda3/envs/swift/bin/python3.11 -u /home/user/Desktop/GRPO/grpo_swift/ms-swift/swift/cli/rlhf.py --rlhf_type grpo --model /home/user/Desktop/GRPO/grpo_swift/output/sft/v4-20250530-192816/checkpoint-2319-merged --reward_funcs external_r1v_acc format --reward_weights 1...