
verl v0.2.1 & v0.3 release checklist

Open eric-haibin-lin opened this issue 9 months ago • 16 comments

v0.2.1

  • [x] add an assertion for when log_prob_micro_batch_size is smaller than world_size, and fix the case where the evaluation dataset size is not divisible by the world_size https://github.com/volcengine/verl/issues/12#issuecomment-2475353389
  • [ ] add an option to remove the call to torch.compile in https://github.com/volcengine/verl/blob/main/verl/workers/actor/dp_actor.py#L56 in case of gcc/nvcc issues https://github.com/volcengine/verl/issues/245#issuecomment-2677172305
  • [ ] include the checkpoint fixes from https://github.com/volcengine/verl/issues/250
  • [ ] check whether https://github.com/volcengine/verl/issues/283 persists (and fix it if so)
  • [ ] multi-node training tutorial with ray start https://github.com/volcengine/verl/issues/278
  • [x] fix the main_generation example https://github.com/volcengine/verl/issues/349 https://github.com/volcengine/verl/pull/351 https://github.com/volcengine/verl/issues/331

v0.3

feel free to propose features (contributions are welcome!)

  • [x] upgrade mcore to v0.6 or v0.11
  • [ ] deepseek v3 examples
  • [ ] megatron checkpoint support
  • [x] megatron qwen2 support https://github.com/volcengine/verl/pull/261
  • [x] https://github.com/volcengine/verl/issues/312
  • [x] multimodal (qwen vl) support
  • [ ] sglang integration
  • [ ] tool calling examples
  • [ ] non-NVIDIA GPU support
  • [ ] startup time optimization

eric-haibin-lin avatar Feb 23 '25 23:02 eric-haibin-lin

How can I help with the 'tool calling examples' part?

BearBiscuit05 avatar Feb 24 '25 04:02 BearBiscuit05

> How can I help with the 'tool calling examples' part?

related to: https://github.com/volcengine/verl/issues/344 https://github.com/volcengine/verl/issues/340

Under the hood, chat calls generate, so the design should work; we just need to provide a working, stable example.
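
A minimal sketch of that relationship using vLLM's offline LLM API (the model name here is a placeholder, not something the thread specifies): chat renders the chat template internally and then runs the same generation path you would hit by templating manually and calling generate.

```python
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2.5-7B-Instruct")  # placeholder model
params = SamplingParams(temperature=0.7, max_tokens=256)
messages = [{"role": "user", "content": "What is 2 + 2?"}]

# High-level path: chat applies the model's chat template internally.
chat_out = llm.chat(messages, params)

# Roughly equivalent low-level path: render the template, then generate.
tokenizer = llm.get_tokenizer()
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
gen_out = llm.generate([prompt], params)
```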

eric-haibin-lin avatar Feb 24 '25 05:02 eric-haibin-lin

Will megatron context parallelism be supported in the future?

liyu199809 avatar Feb 24 '25 09:02 liyu199809

> Will megatron context parallelism be supported in the future?

Yes. We will use an mcore version that supports cp by default.

vermouth1992 avatar Feb 24 '25 10:02 vermouth1992

@BearBiscuit05 See #344, where I outlined the main challenge. I think it should be relatively straightforward if veRL starts using chat, or if vLLM adds tool-calling support directly in generate.

I imagine we could eventually have GRPO-trained reasoners that learn when to use tools as part of their <think> tags, e.g. to execute code for a feedback loop or to retrieve additional information.

casper-hansen avatar Feb 24 '25 12:02 casper-hansen

> @BearBiscuit05 See #344, where I outlined the main challenge. I think it should be relatively straightforward if veRL starts using chat, or if vLLM adds tool-calling support directly in generate.
>
> I imagine we could eventually have GRPO-trained reasoners that learn when to use tools as part of their <think> tags, e.g. to execute code for a feedback loop or to retrieve additional information.

I talked to a vLLM maintainer yesterday. It seems there should be no blocker if we switch from generate to chat. Would you mind giving it a try: calling chat using SPMD-style offline inference?
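
For reference, a minimal sketch of what that experiment might look like with vLLM's offline (SPMD-friendly) API; the model name and tensor_parallel_size are placeholders, and in verl each SPMD worker would run this over its own shard of the batch:

```python
from vllm import LLM, SamplingParams

# Placeholder engine configuration; each SPMD worker would own one engine.
llm = LLM(model="Qwen/Qwen2.5-7B-Instruct", tensor_parallel_size=2)
params = SamplingParams(temperature=1.0, max_tokens=512)

# chat also accepts a batch: a list of conversations (lists of messages).
conversations = [
    [{"role": "user", "content": "Summarize the verl roadmap."}],
    [{"role": "user", "content": "What does GRPO optimize?"}],
]
outputs = llm.chat(conversations, params)
for out in outputs:
    print(out.outputs[0].text)
```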

vermouth1992 avatar Feb 24 '25 12:02 vermouth1992

Not very familiar with inference, but I think I’m starting to get the hang of it. Does this mean I need to build a new chat function and add extra params that include tool calls to invoke generate? Or should I just replace generate directly with the chat function from vllm?

BearBiscuit05 avatar Feb 24 '25 13:02 BearBiscuit05

You should be able to replace generate directly with chat. The only problem is that we currently pass tokenized inputs into generate, whereas chat expects List[ChatCompletionContentPartTextParam] or List[List[ChatCompletionContentPartTextParam]]. I'm not sure what the best design would be in this case.

Case 1: detokenize the tokenized inputs we use for generate (a minimal sketch follows the snippet below).
Case 2: change veRL to not tokenize datasets beforehand (a relatively big change).

```python
from typing import Literal

from typing_extensions import Required, TypedDict


class ChatCompletionContentPartTextParam(TypedDict, total=False):
    text: Required[str]
    """The text content."""

    type: Required[Literal["text"]]
    """The type of the content part."""
```

casper-hansen avatar Feb 24 '25 14:02 casper-hansen

The second choice would incur significant overhead from tokenizing on the fly (typically a 2x slowdown in generation, which is basically unacceptable). I guess we will need to seek a solution for Case 1.

vermouth1992 avatar Feb 24 '25 15:02 vermouth1992

Got it. I'll give it a try.

BearBiscuit05 avatar Feb 25 '25 01:02 BearBiscuit05

> Will megatron context parallelism be supported in the future?
>
> Yes. We will use an mcore version that supports cp by default.

It seems that context parallelism in the model part has not been implemented yet. Is this feature currently available?

liyu199809 avatar Feb 25 '25 05:02 liyu199809

> Will megatron context parallelism be supported in the future?
>
> Yes. We will use an mcore version that supports cp by default.
>
> It seems that context parallelism in the model part has not been implemented yet. Is this feature currently available?

Not right now, but per this roadmap, cp will be supported once verl upgrades MCore.

BearBiscuit05 avatar Feb 25 '25 05:02 BearBiscuit05

Is it possible to optimize startup time? I noticed that launching a job with veRL is significantly slower than with Hugging Face TRL. https://github.com/volcengine/verl/issues/384

casper-hansen avatar Feb 25 '25 14:02 casper-hansen

Disabling torch.compile is useful, as torch.compile can also hang PPO training when use_remove_padding is enabled. #387
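
A minimal sketch of what such an opt-out could look like, using a hypothetical use_torch_compile flag and a self-contained stand-in for the entropy function that dp_actor.py compiles; none of these names come from verl itself:

```python
import torch

def entropy_from_logits(logits: torch.Tensor) -> torch.Tensor:
    # Per-token entropy: H = logsumexp(logits) - sum(softmax(logits) * logits).
    probs = torch.nn.functional.softmax(logits, dim=-1)
    return torch.logsumexp(logits, dim=-1) - torch.sum(probs * logits, dim=-1)

# Hypothetical config flag: fall back to eager mode when gcc/nvcc is broken
# or when the compiled kernel hangs (e.g. with use_remove_padding).
use_torch_compile = False
compute_entropy_from_logits = (
    torch.compile(entropy_from_logits, dynamic=True)
    if use_torch_compile
    else entropy_from_logits
)
```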

maksimstw avatar Mar 02 '25 01:03 maksimstw

> Disabling torch.compile is useful, as torch.compile can also hang PPO training when use_remove_padding is enabled. #387

@maksimstw thanks for the feedback! Would you like to provide a PR with this option?

eric-haibin-lin avatar Mar 03 '25 04:03 eric-haibin-lin

when will you release the "sglang integration" part?

Llipengll avatar Mar 05 '25 08:03 Llipengll

> v0.2.1
>
> v0.3
>
> feel free to propose features (contributions are welcome!)

How do I install v0.3?

JarvisFei avatar Mar 14 '25 02:03 JarvisFei

> add an option to remove the call to torch.compile

Item solved in #554

hongpeng-guo avatar Mar 14 '25 07:03 hongpeng-guo

hi @JarvisFei, v0.3 is not fully released yet, but you are welcome to try the verl main branch: clone the source and run pip install -e . from the repository root

eric-haibin-lin avatar Mar 20 '25 20:03 eric-haibin-lin

As we are already making quite a lot of progress in the main branch, I suggest we do a code freeze this week for v0.3 and push the remaining features to v0.4.

eric-haibin-lin avatar Mar 22 '25 01:03 eric-haibin-lin

Moving discussions to https://github.com/volcengine/verl/issues/710

eric-haibin-lin avatar Mar 30 '25 22:03 eric-haibin-lin