[Bug] Multi-node GRPO hangs; single node works

Open yiyepiaoling0715 opened this issue 9 months ago • 1 comments

Log from the blocked run (the most recent output is at the top):

[Image]

These are my parameters:

[Image]

What is wrong with this setup?

yiyepiaoling0715, Mar 06 '25 02:03

Multi-node GRPO only works with ray job submit <your-args> -- python3 -u -m verl.trainer.main_ppo ...

I suspect you are launching without ray job submit, maybe similar to #491?
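As a rough illustration of the launch pattern described above, here is a minimal sketch; it is not taken from this thread. The head-node IP, the node and GPU counts, and the `DRY_RUN` switch are placeholders invented for illustration, and ports 6379 and 8265 are simply Ray's defaults. The script only prints the command unless `DRY_RUN=0` is set, and the data, model, and rollout overrides from the single-node run still have to be appended.

```shell
#!/usr/bin/env bash
# Sketch only: every address, count, and override below is a placeholder.
set -euo pipefail

HEAD_IP="${HEAD_IP:-10.0.0.1}"          # hypothetical head-node IP
NNODES="${NNODES:-2}"                   # hypothetical number of nodes
GPUS_PER_NODE="${GPUS_PER_NODE:-8}"     # hypothetical GPUs per node
DRY_RUN="${DRY_RUN:-1}"                 # 1 = only print the command (assumed helper switch)

# Step 1, on the head node:
#   ray start --head --port=6379 --dashboard-host=0.0.0.0
# Step 2, on every worker node:
#   ray start --address="${HEAD_IP}:6379"
# Step 3, from a node that can reach the dashboard: submit the trainer as a Ray job.

# Append your own data/model/rollout overrides after the trainer.* entries.
cmd=(
  ray job submit --address="http://${HEAD_IP}:8265" --
  python3 -u -m verl.trainer.main_ppo
  algorithm.adv_estimator=grpo
  trainer.nnodes="${NNODES}"
  trainer.n_gpus_per_node="${GPUS_PER_NODE}"
)

# Show the assembled command; run it only when explicitly requested.
printf '%s ' "${cmd[@]}"; echo
if [[ "${DRY_RUN}" == "0" ]]; then
  "${cmd[@]}"
fi
```

Running `ray status` on the head node before submitting is a quick way to check that all nodes and GPUs have actually joined the cluster.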

casper-hansen, Mar 06 '25 15:03

After many tries, it still fails:

[Image]

yiyepiaoling0715, Mar 10 '25 04:03

> After many tries, it still fails: [Image]

+1

JarvisFei, Mar 11 '25 16:03

> Multi-node GRPO only works with ray job submit <your-args> -- python3 -u -m verl.trainer.main_ppo ...
>
> I suspect you are launching without ray job submit, maybe similar to #491?

Could you give a complete example, please?

JarvisFei, Mar 11 '25 17:03

I am also hitting this. Could you give a complete example, please?

Chezacar, Mar 13 '25 09:03

> Multi-node GRPO only works with ray job submit <your-args> -- python3 -u -m verl.trainer.main_ppo ...
>
> I suspect you are launching without ray job submit, maybe similar to #491?

Can you give more information on this?

yiyepiaoling0715, Mar 17 '25 01:03