Jiaxin Shan

Results 742 comments of Jiaxin Shan

@happyandslow please write down the prerequisites for running the scripts. Let's remove the dataset from the PR; only the scripts needed to generate it are required.

@happyandslow is this PR still needed?

@happyandslow is this PR still necessary?

One approach is to design something more sophisticated, like a least-effective-load (LEL) solution. Compute the effective load `L_i = R_i + α·P_i` for each pod, where R is running and P is pending. We can start...
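A minimal sketch of the effective-load selection described above, assuming each pod reports a running count and a pending count. The names `PodLoad`, `effective_load`, and `pick_least_effective_load`, the tie-break on pod name, and the value of `alpha` are all hypothetical and for illustration only; this is not the gateway's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class PodLoad:
    """Hypothetical per-pod load snapshot (not an AIBrix type)."""
    name: str
    running: int   # R_i: requests currently running on the pod
    pending: int   # P_i: requests queued on the pod


def effective_load(pod: PodLoad, alpha: float) -> float:
    """Effective load L_i = R_i + alpha * P_i."""
    return pod.running + alpha * pod.pending


def pick_least_effective_load(pods: list[PodLoad], alpha: float = 2.0) -> PodLoad:
    """Return the pod with the smallest effective load.

    Ties are broken by pod name so the choice is deterministic
    (an assumption made for this sketch).
    """
    if not pods:
        raise ValueError("no pods to choose from")
    return min(pods, key=lambda p: (effective_load(p, alpha), p.name))


pods = [
    PodLoad("pod-a", running=4, pending=0),
    PodLoad("pod-b", running=2, pending=3),
    PodLoad("pod-c", running=3, pending=1),
]
chosen = pick_least_effective_load(pods, alpha=2.0)
```

With `alpha=2.0` the effective loads are 4, 8, and 5, so `pod-a` is chosen even though `pod-b` has the fewest running requests. With `alpha=0.0` pending work is ignored and `pod-b` wins, which shows how much the weighting changes the decision.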

I think we should investigate the client problem first. That could be the primary reason. I had a brief chat with @happyandslow today and she confirmed that part.

@gangmuk We can expose the target-pod in the response header on the gateway side; that could be helpful.

For the success logs, I think we already have them. Check this: https://github.com/aibrix/aibrix/blob/c4060bb3c5d41949954626f16c0ae15aa82b73ec/test/e2e/routing_strategy_test.go#L69

@varungup90 is there any evidence showing that the imbalance issue is due to the 50ms?

@varungup90 we have an issue tracking the client request interval optimization: https://github.com/vllm-project/aibrix/issues/667. The client is not supposed to send requests in batches. If QPS=8, in even distribution, only single...
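To illustrate the even-distribution point, here is a rough sketch of how a client could space its sends instead of firing them as one batch. The helper `send_offsets` is hypothetical and is not taken from the benchmark client; it only computes when each request should go out.

```python
def send_offsets(qps: float, duration_s: float) -> list[float]:
    """Return evenly spaced send times (seconds from start) for a target QPS.

    Requests are spaced 1/qps apart rather than sent together at the
    start of each second.
    """
    if qps <= 0:
        raise ValueError("qps must be positive")
    interval = 1.0 / qps
    count = int(round(qps * duration_s))
    return [i * interval for i in range(count)]


# QPS=8 over one second: one request every 125 ms.
offsets = send_offsets(8, 1.0)
```

A real client would sleep until each offset (measured with a monotonic clock) before sending, so that at most one request is issued per interval.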

![Image](https://github.com/user-attachments/assets/51194720-34b7-41b4-9323-80413ccd8d01) I changed to an absolute path and it's still not working. This is the LoRA adapter issue.