[Feature]: `kv_transfer_params` not returned for multiple subrequests
🚀 The feature, motivation and pitch
Currently, when an HTTP request contains multiple subrequests, the response includes `kv_transfer_params` for only one of them, so the KV transfer information for the remaining subrequests cannot be accessed.
Related PR: #17751
Reference code: https://github.com/vllm-project/vllm/blob/e384f2f10824df7789c6da35256cf957788c0208/vllm/entrypoints/openai/serving_completion.py#L514-L520
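To make the gap concrete, here is a minimal sketch of the behavior described above and of one possible shape for the requested behavior. It is not the vLLM implementation: `SubrequestOutput`, `build_response_single`, `build_response_per_subrequest`, and the `{"id": ...}` payloads are hypothetical names and placeholder data chosen for illustration. The sketch assumes a single top-level `kv_transfer_params` field on the response; which subrequest's value survives in vLLM is not stated here, so the "last one wins" loop is only an assumption.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SubrequestOutput:
    """Hypothetical stand-in for one subrequest's result (not a vLLM class)."""
    text: str
    kv_transfer_params: Optional[dict[str, Any]] = None


def build_response_single(outputs: list[SubrequestOutput]) -> dict[str, Any]:
    """Models the reported problem: one top-level field, so only one value survives."""
    kv_transfer_params = None
    for out in outputs:
        # Assumption for this sketch: the value is overwritten on each iteration.
        kv_transfer_params = out.kv_transfer_params
    return {
        "choices": [{"index": i, "text": o.text} for i, o in enumerate(outputs)],
        "kv_transfer_params": kv_transfer_params,
    }


def build_response_per_subrequest(outputs: list[SubrequestOutput]) -> dict[str, Any]:
    """One possible fix: attach the params to each choice so none are dropped."""
    return {
        "choices": [
            {"index": i, "text": o.text, "kv_transfer_params": o.kv_transfer_params}
            for i, o in enumerate(outputs)
        ],
    }


# Placeholder payloads; real kv_transfer_params contents depend on the connector.
outputs = [
    SubrequestOutput("first completion", {"id": "kv-0"}),
    SubrequestOutput("second completion", {"id": "kv-1"}),
]
single = build_response_single(outputs)
per_subrequest = build_response_per_subrequest(outputs)
```

In the first shape a client can recover the transfer parameters of only one subrequest; in the second, each choice carries its own. Returning a top-level list indexed by subrequest would be another option, and the choice between them is a design decision for the maintainers.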
Alternatives
No response
Additional context
No response
Before submitting a new issue...
- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.