Liguang Xie

72 comments by Liguang Xie

@chenpiaoping This is for your tracking purpose.

The dashboard uses the [llmperf](https://github.com/ray-project/llmperf) benchmarking tool to run tests over different LLM APIs. You can use the exact same [llmperf](https://github.com/ray-project/llmperf) tool to run tests against models deployed in your...
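For anyone who wants to reproduce this kind of run against their own deployment, a minimal sketch might look like the following. The endpoint URL, model name, and parameter values here are placeholders, not the dashboard's actual configuration — check the llmperf README for the authoritative flag list:

```shell
# Assumption: your deployment exposes an OpenAI-compatible API.
export OPENAI_API_BASE="http://your-endpoint:8000/v1"  # placeholder endpoint
export OPENAI_API_KEY="EMPTY"                          # placeholder key

# Install llmperf from source.
git clone https://github.com/ray-project/llmperf
cd llmperf && pip install -e .

# Run the token benchmark against your model (values are illustrative).
python token_benchmark_ray.py \
  --model "your-model-name" \
  --llm-api openai \
  --mean-input-tokens 550 \
  --stddev-input-tokens 150 \
  --mean-output-tokens 150 \
  --stddev-output-tokens 10 \
  --num-concurrent-requests 4 \
  --max-num-completed-requests 100 \
  --results-dir result_outputs
```

The results directory will contain per-request and summary metrics (e.g. time to first token, inter-token latency, throughput) that can be compared across providers.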

Great feedback! Let us plan for that :-)

The dashboard uses the [llmperf](https://github.com/ray-project/llmperf) benchmarking tool to run tests over different LLM APIs. As noted on the [llmperf caveats and disclaimers](https://github.com/ray-project/llmperf?tab=readme-ov-file#caveats-and-disclaimers) page, the backends of different API providers might vary widely,...

The performance of various models depends on several factors, including model size, compute configuration (GPU model and count), and model-, system-, or algorithm-level optimizations. An API provider may have a...

Hi @linjianshu Thanks for reporting this issue. I'm a co-creator of both AIBrix and LLMPerf, and it's great to see you're using LLMPerf to benchmark against AIBrix — really appreciate...

Cool! Great RFC overall.

> It's meaningless to support > 1 loras on single pod.

Quick q: did you mean support "< 1 loras on single pod"?

> * For applications metrics, only head pod which has the apiserver deployed emit the metrics. it has to be consistent with the number of the units.

Thanks @Jeffwan. This...

@Jeffwan @varungup90 could you guys help on this issue? Thanks!