requestTrace in the cache of gateway plugin does not support stream request

Open zhangjyr opened this issue 11 months ago • 3 comments

🐛 Describe the bug

requestTrace currently only reports data when requests are not using streaming. It should work for both streaming and non-streaming vLLM requests.

Steps to Reproduce

No response

Expected behavior

No response

Environment

No response

zhangjyr avatar Jan 06 '25 05:01 zhangjyr

The request trace is logged for streaming requests as well.

In the streaming scenario, the total token count is reported in the second-to-last stream chunk. When HandleResponseBody receives the chunk with total tokens set, it logs the request trace. [image]
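To illustrate the behavior described above, here is a minimal sketch of how a response-body handler could detect the chunk that carries token usage. It assumes an OpenAI-compatible SSE stream in which the client requested usage reporting, so the last data chunk before `data: [DONE]` has a non-null `usage` object. The names `Usage`, `streamChunk`, and `usageFromSSE` are hypothetical and are not taken from the AIBrix gateway code.

```go
package main

import (
	"encoding/json"
	"strings"
)

// Usage mirrors the "usage" object of an OpenAI-compatible response chunk.
// (Hypothetical type for illustration, not the gateway's own.)
type Usage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

// streamChunk decodes only the field this sketch cares about.
type streamChunk struct {
	Usage *Usage `json:"usage"`
}

// usageFromSSE scans one response-body buffer, which may hold several SSE
// events, and returns the usage of the first event whose total_tokens is set.
// The boolean is false when no such event is present, for example for
// ordinary content chunks or for the terminating "data: [DONE]" event.
func usageFromSSE(body string) (Usage, bool) {
	for _, line := range strings.Split(body, "\n") {
		line = strings.TrimSpace(line)
		if !strings.HasPrefix(line, "data:") {
			continue // blank separator lines, comments, other SSE fields
		}
		payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		if payload == "" || payload == "[DONE]" {
			continue
		}
		var chunk streamChunk
		if err := json.Unmarshal([]byte(payload), &chunk); err != nil {
			continue // assumption: skip partial or malformed JSON
		}
		if chunk.Usage != nil && chunk.Usage.TotalTokens > 0 {
			return *chunk.Usage, true
		}
	}
	return Usage{}, false
}
```

A handler built this way would record the trace as soon as `usageFromSSE` returns true, rather than waiting for the end-of-stream signal, because the final `[DONE]` event carries no usage.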

varungup90 avatar Jan 06 '25 17:01 varungup90

@zhangjyr Can you check the response? If no action is required, please close the issue.

varungup90 avatar Feb 19 '25 20:02 varungup90

Right now the request trace is added on EndOfStream, but for streaming it needs to be added on the (n-1)th stream chunk. cc https://github.com/vllm-project/aibrix/issues/790 - we can add support once we have a separate feature flag for heterogeneous.

varungup90 avatar Mar 05 '25 00:03 varungup90

This is completed.

varungup90 avatar Apr 11 '25 21:04 varungup90