Yang Zheng

11 comments by Yang Zheng

This seems related to PR https://github.com/sgl-project/sglang/pull/757: `scaling_factor` is set to 1, which leads to a wrong `context_length`.

BTW, I also saw "Unrecognized keys in `rope_scaling` for 'rope_type'='yarn': {'original_max_position_embeddings'}" in vLLM. That seems unrelated.

@ronensc, could you please review this metrics update?

Can someone explain what the buildkite/fastcheck/pr/docker-build-image test does?

> @zhengy001 Can you run a benchmark for this PR? LGTM once the performance is determined to be still reasonable. (I don't have idle device to run with FA2 backend...

> LGTM! The benchmark result looks reasonable!

Thanks, @Isotr0py. Do you know what the failed buildkite/fastcheck/pr/tpu-test error is about? ![image](https://github.com/user-attachments/assets/24342065-2bc3-4c01-af22-db356a771026) It seems unrelated. Do you know how to retry CI?

> You can sync this PR branch with the main branch to re-run the CI from new commit.

Okay, thanks.

> Thanks for the PR. Unfortunately I don't think this is the strategy we want to have in vLLM core. Although we indeed have this issue, we attempt to solve...