Liguang Xie

72 comments by Liguang Xie

@chenpiaoping This is for your tracking purpose.

The dashboard uses the [llmperf](https://github.com/ray-project/llmperf) benchmarking tool to run tests over different LLM APIs. You can use the exact same [llmperf](https://github.com/ray-project/llmperf) tool to run tests against models deployed in your...
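For anyone who wants to reproduce this kind of run against their own deployment, a minimal sketch might look like the following. The endpoint URL, model name, and parameter values here are placeholders, not the dashboard's actual configuration — check the llmperf README for the authoritative flag list:

```shell
# Assumption: your deployment exposes an OpenAI-compatible API.
export OPENAI_API_BASE="http://your-endpoint:8000/v1"  # placeholder endpoint
export OPENAI_API_KEY="EMPTY"                          # placeholder key

# Install llmperf from source.
git clone https://github.com/ray-project/llmperf
cd llmperf && pip install -e .

# Run the token benchmark against your model (values are illustrative).
python token_benchmark_ray.py \
  --model "your-model-name" \
  --llm-api openai \
  --mean-input-tokens 550 \
  --stddev-input-tokens 150 \
  --mean-output-tokens 150 \
  --stddev-output-tokens 10 \
  --num-concurrent-requests 4 \
  --max-num-completed-requests 100 \
  --results-dir result_outputs
```

The results directory will contain per-request and summary metrics (e.g. time to first token, inter-token latency, throughput) that can be compared across providers.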

Great feedback! Let us plan for that :-)

The dashboard uses the [llmperf](https://github.com/ray-project/llmperf) benchmarking tool to run tests over different LLM APIs. As noted on the [llmperf caveats and disclaimers](https://github.com/ray-project/llmperf?tab=readme-ov-file#caveats-and-disclaimers) page, the backends of different API providers might vary widely,...

The performance of various models depends on several factors, including model size, compute configuration (GPU model and count), and model-, system-, or algorithm-level optimizations. An API provider may have a...

Hi @linjianshu Thanks for reporting this issue. I'm a co-creator of both AIBrix and LLMPerf, and it's great to see you're using LLMPerf to benchmark against AIBrix — really appreciate...

Cool! Great RFC overall.

> It's meaningless to support > 1 loras on single pod.

Quick q: did you mean support "< 1 loras on single pod"?

> * For applications metrics, only head pod which has the apiserver deployed emit the metrics. it has to be consistent with the number of the units.

Thanks @Jeffwan. This...

@Jeffwan @varungup90 could you guys help on this issue? Thanks!