[mcore] verl+megatron development tracking
veRL Megatron-core Development Tracking
This page tracks development of verl + mcore. The milestone target is to enable training DeepSeek-V3 on veRL (#708); the further goal is to continuously improve the verl training experience with the mcore backend.
Progress and TODO
Recent
- [x] update mcore version to 0.11 #392
- [x] use mcore `GPTModel` API instead of the huggingface workaround, with sequence packing #706
- [x] support context parallel #970
- [x] support loading mcore dist_checkpointing #1030
- [x] support Megatron 0.11.0 and vLLM 0.8.2 #851
- [x] support qwen2moe training #1139
- [x] support `Moonlight-16B-A3B` training (WIP) #1284
- [ ] support `Qwen2.5-VL` training #1286
- [x] support EP (expert parallel) #1467
Further
- [ ] FP8 training
- [ ] training-efficiency optimizations
- [ ] support sglang inference engine
- [ ] support trtllm inference engine
Could you merge the TODO list from this issue as well? https://github.com/volcengine/verl/issues/825
How does the mcore backend handle `make_vocab_size_divisible_by`, and where should the vocabulary be padded to meet the tensor-parallel splitting requirements?
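For context, a minimal sketch of the Megatron-LM padding convention this question refers to: the embedding vocab is rounded up to the next multiple of `make_vocab_size_divisible_by * tensor_model_parallel_size` so it can be split evenly across tensor-parallel ranks. The `padded_vocab_size` helper below is a hypothetical standalone illustration of that arithmetic, not verl's or Megatron's actual code path.

```python
def padded_vocab_size(orig_vocab_size: int,
                      make_vocab_size_divisible_by: int = 128,
                      tensor_model_parallel_size: int = 1) -> int:
    """Round the vocab size up so the embedding splits evenly across TP ranks.

    Mirrors the Megatron-LM convention: pad to the next multiple of
    make_vocab_size_divisible_by * tensor_model_parallel_size.
    """
    multiple = make_vocab_size_divisible_by * tensor_model_parallel_size
    return ((orig_vocab_size + multiple - 1) // multiple) * multiple


# GPT-2's 50257-token vocab padded with the default divisor of 128 (TP=1):
print(padded_vocab_size(50257, 128, 1))   # -> 50304
# A 32000-token vocab under TP=8 must be divisible by 128 * 8 = 1024:
print(padded_vocab_size(32000, 128, 8))   # -> 32768
```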
Could you add support for Qwen SFT training with the Megatron backend?
Referencing a related issue: https://github.com/volcengine/verl/issues/708
The SGLang backend (the "support sglang inference engine" item) is already available in verl, right?