Jiaxin Shan
Sure. Let's keep this open and revisit it later.
I think the root cause above is that the folder is not in the tracked paths. Since the benchmark code is not part of the e2e test, I think we do not need...
For the CI issue, I left a comment at https://github.com/vllm-project/aibrix/issues/739 ; please take a look.
- GPU device: 2 * A100 80G
- dataset: input size 1024, output size 6 (default value)
- model: Qwen/Qwen2.5-7B-Instruct

chunk-size 512

chunk-size 2048

I notice the default chunk...
@halittiryaki @dafu-wu @sadath-12 do you have a strong need for this feature? The problem is that the activation needs to be handled separately; it would be super slow, and I doubt whether...
@halittiryaki I'd just like to confirm the client-side behavior. Let's say the replica count is 0 now: do you expect the proxy to hold the request until the node is...
@brosoul I am trying to understand the question. Do you mean implementing a plugin to show the status? I think that's reasonable; it would require some kind of kubectl plugin, I think. ``` kubectl...
@nwangfw Let's squash the commits and create a clean commit message.
We had a different version with more tests and better coverage (chat/completion/embedding). Let me rebase onto master and share it with the community. @TangJiakai @simon-mo
@Colstuwjx A quick question: for managed endpoints, we do not have any control over the behavior. Are you assuming that the Azure deployment will automatically cache it for you...