Mingyang Song
Hi, thanks for your interest! We don’t use any DeepSpeed features during the RL procedure, as they may conflict with vLLM-based rollouts. However, we do provide a ZeRO-3 config for TF-EVAL inference...
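For readers unfamiliar with ZeRO-3, here is a rough sketch of what such a config generally looks like. This is not the config shipped with the repo: the specific values and the `write_config` helper are assumptions for illustration only, and just the key names follow DeepSpeed's JSON config schema.

```python
import json

# Illustrative ZeRO-3 settings. The values are assumptions,
# not the actual config provided for TF-EVAL inference.
zero3_config = {
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,  # stage 3 also partitions model parameters across GPUs
        "overlap_comm": True,
        "contiguous_gradients": True,
        "stage3_param_persistence_threshold": 100000,
    },
    "train_micro_batch_size_per_gpu": 1,
}


def write_config(path):
    """Hypothetical helper: dump the dict to a JSON file that DeepSpeed can load."""
    with open(path, "w") as f:
        json.dump(zero3_config, f, indent=2)
    return path
```

The part that matters for inference is `"stage": 3`, which shards the model weights across GPUs so that a model too large for one device can still be loaded.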