
missing `'` for `actor_rollout_ref.rollout.n` in examples?

Open DaizeDong opened this issue 1 month ago • 0 comments

System Info

Today, when I tried to run examples/grpo_trainer/run_qwen3moe-30b_megatron_96gb.sh, a strange error occurred:

Could not override 'actor_rollout_ref.rollout.n'.
To append to your config use +actor_rollout_ref.rollout.n=16
Key 'n' is not in struct
    full_key: actor_rollout_ref.rollout.n
    object_type=dict

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

I found that this happens because the config in _generated_ppo_megatron_trainer.yaml uses 'n' as the key name, while the arguments in run_qwen3moe-30b_megatron_96gb.sh use n.

After making the following change in examples/grpo_trainer/run_qwen3moe-30b_megatron_96gb.sh, the error was fixed: actor_rollout_ref.rollout.n => actor_rollout_ref.rollout.'n'
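
For anyone trying to reproduce this, here is a small sanity check, assuming the script is run by bash or another POSIX shell. The helper name show_args is hypothetical and only for illustration. In such a shell, single quotes written directly on the command line are removed before the program receives its arguments, so it may be worth confirming what string actually reaches Hydra in each case. The sketch below only demonstrates the shell side and does not call Hydra or verl.

```shell
#!/bin/sh
# Hypothetical helper: print each argument exactly as a program would receive it.
show_args() { printf '%s\n' "$@"; }

# Override as written in the original example script.
unquoted=$(show_args actor_rollout_ref.rollout.n=16)

# Override with bare single quotes around n; the shell strips these quotes.
quoted=$(show_args actor_rollout_ref.rollout.'n'=16)

# Override where the single quotes are protected by outer double quotes,
# so they survive and are passed through to the program.
escaped=$(show_args "actor_rollout_ref.rollout.'n'=16")

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
echo "escaped:  $escaped"
```

If the first two lines print the same string in your environment, the program sees identical overrides in both cases, and only the third form delivers literal quote characters. Comparing these against the key as it appears in _generated_ppo_megatron_trainer.yaml may help narrow down where the mismatch comes from.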

Is this a bug? I also found that all scripts in examples are missing the '. I can open a PR to fix this if needed.

Information

  • [x] The official example scripts
  • [ ] My own modified scripts

Tasks

  • [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [ ] My own task or dataset (give details below)

Reproduction

run examples/grpo_trainer/run_qwen3moe-30b_megatron_96gb.sh

Expected behavior

See the description above.

DaizeDong, Nov 24 '25 08:11