Yuming Huang

## I am using the latest verl, and here is my training script: python3 -m verl.trainer.main_ppo \ algorithm.adv_estimator=grpo \ data.train_files=/raphealhuang/simpleRL/train.parquet \ data.val_files=/raphealhuang/simpleRL/test.parquet \ data.train_batch_size=1024 \ data.max_prompt_length=512 \ data.max_response_length=1024 \ data.filter_overlong_prompts=True...
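
For anyone skimming the command: the `key=value` arguments are dotted config overrides, so they fall into groups such as `algorithm` and `data`. Here is a minimal sketch of that grouping, using only the overrides visible in the truncated command above. `parse_overrides` is a hypothetical helper written for illustration; it is not part of verl or of its config loader, and it keeps every value as a plain string rather than guessing types.

```python
def parse_overrides(args):
    """Group dotted key=value strings into a nested dict (values stay strings)."""
    config = {}
    for arg in args:
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = arg.partition("=")
        *parents, leaf = key.split(".")
        node = config
        for part in parents:
            # Walk down the tree, creating intermediate groups as needed.
            node = node.setdefault(part, {})
        node[leaf] = value
    return config


# Only the overrides that appear in the (truncated) command are listed here.
overrides = [
    "algorithm.adv_estimator=grpo",
    "data.train_batch_size=1024",
    "data.max_prompt_length=512",
    "data.max_response_length=1024",
    "data.filter_overlong_prompts=True",
]

config = parse_overrides(overrides)
for group, settings in config.items():
    print(group, settings)
```

Laying the overrides out this way makes it easier to see which settings belong to the data pipeline and which to the algorithm when comparing two runs.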