Guangming Sheng

5 issues authored by Guangming Sheng

Thanks: @HillZhang1999 - Related issue: https://github.com/volcengine/verl/issues/189

`(main_task pid=3523385) ValueError: max_num_batched_tokens (8192) is smaller than max_model_len (9216). This effectively limits the maximum sequence length to max_num_batched_tokens and makes vLLM reject longer...`
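The check behind this error can be sketched as follows. `validate_token_budget` is a hypothetical helper, not vLLM's actual code: when chunked prefill is unavailable, every token of a sequence must fit in one scheduling batch, so the per-batch token budget has to be at least the model context length.

```python
# Hypothetical sketch (not vLLM's real implementation) of the consistency
# check that produces the error above: the scheduler's per-batch token
# budget (max_num_batched_tokens) must cover the full model context
# length (max_model_len), or long sequences can never be scheduled.

def validate_token_budget(max_num_batched_tokens: int, max_model_len: int) -> None:
    """Raise ValueError when the batch token budget cannot hold a full-length sequence."""
    if max_num_batched_tokens < max_model_len:
        raise ValueError(
            f"max_num_batched_tokens ({max_num_batched_tokens}) is smaller than "
            f"max_model_len ({max_model_len}). This effectively limits the maximum "
            f"sequence length to max_num_batched_tokens."
        )

# The fix reported in the issue: raise the budget to at least the context length.
validate_token_budget(max_num_batched_tokens=9216, max_model_len=9216)  # passes
```

With the values from the traceback (8192 vs. 9216) the helper raises, mirroring the reported failure.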

## Themes

We categorized our roadmap into 8 themes: Broad Model Support, Regular Update, More RL Algorithms Support, Dataset Coverage, Plugin Support, Scaling Up RL, More LLM Infrastructure Support, Wide...

## How veRL uses Megatron-LM and MCore v0.4.0

- **Initialization:** We build the necessary parallel groups using [`mpu.initialize_model_parallel`](https://github.com/volcengine/verl/blob/main/verl/trainer/ppo/workers/megatron_workers.py#L86) without initializing the global args in Megatron-LM.
- **Model:**
  - We...
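A minimal sketch of the combinatorics behind parallel-group setup, assuming a rank layout where tensor parallelism varies fastest, then pipeline, then data parallelism. This is illustrative only; Megatron's actual `initialize_model_parallel` builds process groups and its real group ordering differs in detail:

```python
# Illustrative sketch: decompose a global rank into (data, pipeline, tensor)
# parallel coordinates, in the spirit of mpu.initialize_model_parallel.
# Assumed layout (not Megatron's exact one): tensor rank varies fastest,
# then pipeline rank, then data-parallel rank.

def parallel_coords(rank: int, tp_size: int, pp_size: int, world_size: int):
    """Return (dp_rank, pp_rank, tp_rank) for a given global rank."""
    assert world_size % (tp_size * pp_size) == 0, "world size must divide evenly"
    tp_rank = rank % tp_size
    pp_rank = (rank // tp_size) % pp_size
    dp_rank = rank // (tp_size * pp_size)
    return dp_rank, pp_rank, tp_rank

# Example: 8 GPUs with TP=2 and PP=2 leaves DP=2 replicas.
print(parallel_coords(5, tp_size=2, pp_size=2, world_size=8))  # -> (1, 0, 1)
```

Under this layout, ranks sharing all but `tp_rank` form a tensor-parallel group, and likewise for the pipeline and data-parallel dimensions.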

Labels: enhancement, megatron

1. **Prerequisite:** Make sure the LLM inference framework can be launched in SPMD style. For example, the LLM inference script can be launched by `torchrun --standalone --nproc_per_node=8 offline_inference.py`
2. ...
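The SPMD prerequisite can be sketched as below. The script name `offline_inference.py` comes from the example command; its contents here are hypothetical, not verl's actual script. Under `torchrun`, every process executes the same file and learns its identity from the environment variables `torchrun` sets (`RANK`, `WORLD_SIZE`, `LOCAL_RANK`):

```python
# offline_inference.py -- hypothetical SPMD-style sketch (contents are
# illustrative, not verl's actual script). torchrun launches N identical
# copies of this program; each copy reads its identity from the
# environment rather than receiving it as an argument.
import os

def spmd_identity() -> str:
    """Report this process's position in the SPMD job from torchrun's env vars."""
    rank = int(os.environ.get("RANK", "0"))
    world_size = int(os.environ.get("WORLD_SIZE", "1"))
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    # A real script would now pin its GPU via local_rank, load its model
    # shard, and run inference on its slice of the input batch.
    return f"rank {rank}/{world_size} (local {local_rank}) ready"

if __name__ == "__main__":
    print(spmd_identity())
```

Because no rank is special, the same binary scales from a single process to a full node simply by changing the launcher's process count.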

Labels: enhancement, generation

### What does this PR do?

Fix the resource pool name in standalone mode.

### Checklist Before Starting

- [ ] Search for similar PRs. Paste at least one query link...