verl [WIP] [single_controller] feat: PyTorch Monarch integration

Open keyan opened this issue 1 month ago • 1 comment

What does this PR do?

Rough initial transfer of the internal Monarch integration.

Cleanup items:

[ ] Clean up the many TODOs, some of which are more complex
[ ] Remove internal infra usage such as create_mast_proc_mesh
[ ] The PPO trainer code was copied from the Ray trainer; refactor it under a shared parent class to remove duplicate logic
[ ] Move PPO util methods such as apply_kl_penalty to a shared util module
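
The last two cleanup items could take roughly the following shape. This is a minimal sketch, not the code in this PR: `BasePPOTrainer`, `FakeMonarchTrainer`, `init_workers`, `rollout`, and `step` are hypothetical names, and the `apply_kl_penalty` here is a simplified stand-in that works on plain lists rather than verl's actual function or signature.

```python
from abc import ABC, abstractmethod


def apply_kl_penalty(rewards, kl, beta):
    """Simplified stand-in (assumption): subtract a scaled KL term from each reward.

    In the refactor this would live in a shared util module that both the
    Ray and Monarch trainers import.
    """
    return [r - beta * k for r, k in zip(rewards, kl)]


class BasePPOTrainer(ABC):
    """Hypothetical parent class holding the backend-independent PPO logic."""

    def __init__(self, beta):
        self.beta = beta

    @abstractmethod
    def init_workers(self):
        """Backend-specific worker setup (Ray actors, Monarch meshes, ...)."""

    @abstractmethod
    def rollout(self):
        """Backend-specific generation; returns (rewards, kl) for one batch."""

    def step(self):
        # Shared logic: identical for every backend, so it is written once here.
        rewards, kl = self.rollout()
        return apply_kl_penalty(rewards, kl, self.beta)


class FakeMonarchTrainer(BasePPOTrainer):
    """Toy subclass with canned data, only to show what a backend overrides."""

    def init_workers(self):
        self.ready = True

    def rollout(self):
        return [1.0, 0.5], [0.5, 0.25]


trainer = FakeMonarchTrainer(beta=0.5)
trainer.init_workers()
penalized = trainer.step()
```

With this split, a backend only implements worker setup and rollout, while the PPO update logic stays in one place.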

Test

Experimental results come from post-training Qwen-2.5-7B on H200 GPUs using Megatron-LM.

Experimental data pending wider release.

API and Usage Example

Demonstrate how the API changes if any, and provide usage example(s) if possible.

python3 -m verl.trainer.main_ppo_monarch \
    --config-path=config \
    --config-name='ppo_megatron_trainer.yaml' \
    ...

Design & Code Changes

TODO

Checklist Before Submitting

[!IMPORTANT] Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

[ ] Read the Contribute Guide.
[ ] Apply pre-commit checks: pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always
[ ] Add / Update the documentation.
[ ] Add unit or end-to-end test(s) to the CI workflow to cover all the code. If not feasible, explain why: ...
[ ] Once your PR is ready for CI, send a message in the ci-request channel in the verl Slack workspace. (If not accessible, please try the Feishu group (飞书群).)

Oct 09 '25 20:10 keyan

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Nov 10 '25 02:11 CLAassistant
