Casper
Is it possible to optimize startup time? I noticed that when using veRL, launching a job is significantly slower than with Hugging Face TRL https://github.com/volcengine/verl/issues/384
An alternative way to implement this kind of feature is seen in the fork below. That one builds it into an `LLMGenerationManager` that can add context on the fly as...
@muellerzr For reference, this error occurs when we run inference in order to quantize the model. I have not received similar reports for inference with already-quantized models. So in other...
Hi @vermouth1992, this PR broke the async vLLM server. Considering this was an update for SGLang, I am quite surprised. CC @wuxibin89. Here is the problem:
- `omegaconf.errors.ConfigAttributeError: Key 'format' is...`
@jybsuper I would appreciate a patch for this so that the `ChatCompletionScheduler` can use similar config arguments. I do have a preference for the scheduler because of how easy it...
It's not currently supported. See this issue: https://github.com/volcengine/verl/issues/398
You can run `pip install -e .[eval]` to get the right dependencies.
Please update your axolotl version, as this was fixed after the commit you are using; #795 fixed this.
Multi-node GRPO only works with `ray job submit -- python3 -u -m verl.trainer.main_ppo ...`. I suspect you are launching without `ray job submit`, maybe similar to #491?
The default model-loading behavior in `transformers` seems to have changed recently. For now, you can pass `device_map` explicitly when needed.