[WIP] [single_controller] feat: PyTorch Monarch integration

Open • keyan opened this issue 1 month ago • 1 comment

What does this PR do?

Rough initial port of the internal Monarch integration.

Cleanup items:

  • [ ] Clean up the many TODOs, some of which are more complex
  • [ ] Remove internal infra usage such as create_mast_proc_mesh
  • [ ] The PPO trainer code was copy/pasted from the Ray trainer; refactor it to share a parent class and remove the duplicated logic
  • [ ] Move PPO util methods such as apply_kl_penalty to a shared util module
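
One possible shape for the last two cleanup items, as a minimal sketch rather than the planned implementation. Everything here is hypothetical: the class names `BasePPOTrainer`, `RayBackedTrainer`, and `MonarchBackedTrainer`, the `collect_rollout` hook, and the list-based signature of `apply_kl_penalty` are simplified stand-ins and do not match the actual verl code.

```python
from abc import ABC, abstractmethod


def apply_kl_penalty(token_rewards, log_probs, ref_log_probs, kl_coef):
    """Simplified stand-in: subtract kl_coef * (log_prob - ref_log_prob) per token."""
    return [
        reward - kl_coef * (lp - ref_lp)
        for reward, lp, ref_lp in zip(token_rewards, log_probs, ref_log_probs)
    ]


class BasePPOTrainer(ABC):
    """Hypothetical parent class holding the backend-independent PPO logic."""

    def __init__(self, kl_coef):
        self.kl_coef = kl_coef

    @abstractmethod
    def collect_rollout(self, prompts):
        """Backend-specific: return a dict with rewards, log_probs, ref_log_probs."""

    def penalized_rewards(self, prompts):
        # Shared step: every backend applies the same KL penalty to its rollout.
        batch = self.collect_rollout(prompts)
        return apply_kl_penalty(
            batch["rewards"], batch["log_probs"], batch["ref_log_probs"], self.kl_coef
        )


def _fake_rollout(prompts):
    # Canned numbers standing in for real worker output.
    n = len(prompts)
    return {"rewards": [1.0] * n, "log_probs": [-1.0] * n, "ref_log_probs": [-1.5] * n}


class RayBackedTrainer(BasePPOTrainer):
    def collect_rollout(self, prompts):
        return _fake_rollout(prompts)  # placeholder for Ray worker-group calls


class MonarchBackedTrainer(BasePPOTrainer):
    def collect_rollout(self, prompts):
        return _fake_rollout(prompts)  # placeholder for Monarch actor calls
```

With this layout each backend only overrides the dispatch hook, and the KL penalty lives in one module-level function that both trainers import.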

Test

Experimental results are available from post-training Qwen-2.5-7B on H200 GPUs with Megatron-LM.

The experimental data is pending wider release.

API and Usage Example

Demonstrate how the API changes if any, and provide usage example(s) if possible.

python3 -m verl.trainer.main_ppo_monarch \
    --config-path=config \
    --config-name='ppo_megatron_trainer.yaml' \
    ...

Design & Code Changes

TODO

Checklist Before Submitting

[!IMPORTANT] Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

keyan, Oct 09 '25 20:10

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

CLAassistant, Nov 10 '25 02:11