Casper
Is it possible to optimize startup time? I noticed that when using veRL, launching a job is significantly slower than with Hugging Face TRL https://github.com/volcengine/verl/issues/384
An alternative way to implement this kind of feature is seen in the fork below. That one builds it into an `LLMGenerationManager` that can add context on the fly as...
@muellerzr For reference, this error occurs when we run inference in order to quantize the model. I have not received similar reports for inference with already-quantized models. So in other...
Hi @vermouth1992, this PR broke the async vLLM server. Considering this was an update for SGLang, I am quite surprised. CC @wuxibin89. Here is the problem:
- `omegaconf.errors.ConfigAttributeError: Key 'format' is...`
@jybsuper I would appreciate a patch for this so that the `ChatCompletionScheduler` can use similar config arguments. I do have a preference for the scheduler because of how easy it...
It's not currently supported. See this issue: https://github.com/volcengine/verl/issues/398
You can run `pip install -e .[eval]` to get the right dependencies.
Please update your axolotl version, as this was fixed after the commit you are using; #795 fixed this.
Multi-node GRPO only works with `ray job submit -- python3 -u -m verl.trainer.main_ppo ...`. I suspect you are launching without `ray job submit`, maybe similar to #491?
The default model-loading behavior in `transformers` seems to have changed recently. For now, you can pass `device_map` explicitly when needed.