guyueh1

Results 29 comments of guyueh1

@youngeunkwon0405 you can create sub-issues once you have identified concrete items. If they are general items that are not async-specific, you can assign them to me.

There are actually no items planned for v0.5. We will plan for v0.6.

@terrykong No features or fixes will be ready for v0.5; we will plan for v0.6, if that's alright.

@parthmannan do you have a different PR that supports asymmetric VPP? If so, maybe we should close this one and work on your PR, as that covers more cases.

@parthmannan let's merge all VPP-related changes into one PR. I can drive this if you merge your changes into my branch (guyueh1/support_mcore_vpp).

@guyueh1 to perform FP8 GRPO convergence test and downstream evaluation

Here is the latest status of FP8 GRPO support in nemo-rl

| | GRPO | | SFT | |
|-----------------|--------|----------|--------|----------|
| | llama3 | Qwen3MoE | llama3 | Qwen3MoE |
...

Renaming this issue to clarify that its purpose is to minimize Ray-related overhead in GRPO, including:
- returning samples to driver after generation
- dispatching training functions
- sending samples to...

@MattFeinberg The current NeMo RL container, if you built it from `docker/Dockerfile`, uses a CUDA 12.9 container as its base, so the container doesn't have a cuda 13....