Yang Zheng
Seems related to this PR: https://github.com/sgl-project/sglang/pull/757. `scaling_factor` is set to 1, which leads to a wrong `context_length`.
BTW, I also saw "Unrecognized keys in `rope_scaling` for 'rope_type'='yarn': {'original_max_position_embeddings'}" in vLLM. Seems unrelated.
@ronensc, could you please review this metrics update?
Can someone explain what the buildkite/fastcheck/pr/docker-build-image test does?
> @zhengy001 Can you run a benchmark for this PR? LGTM once the performance is determined to be still reasonable. (I don't have idle device to run with FA2 backend...
> LGTM! The benchmark result looks reasonable!
Thanks, @Isotr0py. Do you know why buildkite/fastcheck/pr/tpu-test failed? It seems unrelated. Do you know how to retry CI?
> You can sync this PR branch with the main branch to re-run the CI from new commit.
Okay, thanks.
BlockSpaceManagerV1 just got removed. :(
> Thanks for the PR. Unfortunately I don't think this is the strategy we want to have in vLLM core. Although we indeed have this issue, we attempt to solve...