worker-vllm

Fix rope_scaling dict parsing

Open Code42Cate opened this issue 6 months ago • 2 comments

Fixes https://github.com/runpod-workers/worker-vllm/issues/192 by parsing the JSON value into a dict, which is what vLLM expects (https://github.com/vllm-project/vllm/blob/main/vllm/engine/arg_utils.py#L354C5-L354C17), instead of passing the raw string through.

Tested by deploying to Runpod with Menlo/Jan-nano-128k and ROPE_SCALING={"rope_type":"yarn","factor":3.2,"original_max_position_embeddings":40960}
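
For reference, a minimal sketch of the parsing step described above, not the exact diff. `parse_rope_scaling` is a hypothetical helper name, and treating an unset or empty variable as `None` is an assumption about the desired default:

```python
import json
import os
from typing import Any, Dict, Optional


def parse_rope_scaling(raw: Optional[str]) -> Optional[Dict[str, Any]]:
    """Turn the ROPE_SCALING env value into a dict, or None when unset.

    Hypothetical helper for illustration; the name and the None default
    are assumptions, not taken from the worker-vllm source.
    """
    # An unset or blank variable means "no rope scaling override".
    if raw is None or not raw.strip():
        return None
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"ROPE_SCALING is not valid JSON: {exc}") from exc
    # The value must be a JSON object, so reject lists, numbers, and strings.
    if not isinstance(parsed, dict):
        raise ValueError("ROPE_SCALING must be a JSON object")
    return parsed


# Read the variable once at startup and hand the dict (or None) to the engine args.
rope_scaling = parse_rope_scaling(os.getenv("ROPE_SCALING"))
```

Failing early on malformed JSON keeps a typo in the env var from surfacing later as an opaque engine error.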

Still waiting for the build so I can test whether it also works without the env var set.

Code42Cate avatar Jul 02 '25 13:07 Code42Cate

@Code42Cate was your test without the env var also successful?

TimPietrusky avatar Jul 08 '25 11:07 TimPietrusky

Have the Docker images been updated with this?

Is this now working?

batmar125 avatar Aug 11 '25 21:08 batmar125