Designing an effective metric to identify imbalances and measure distribution fairness

Open Jeffwan opened this issue 10 months ago • 0 comments

🚀 Feature Description and Motivation

metrics: requests, tokens (prefill, decode), latencies (e2e, TTFT, TPOT), resources (SM_ACTIVE)

measurement:

  • requests per pod
  • standard deviation of requests
  • gini coefficient
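
As a rough illustration of the three measurements above, here is a minimal sketch that takes a per-pod request count and reports the standard deviation and the Gini coefficient. The function names (`gini`, `imbalance_summary`), the pod names, and the added coefficient of variation are assumptions made for the example and are not part of aibrix. The same calculation could be applied to token counts or latencies per pod.

```python
from statistics import mean, pstdev


def gini(values):
    """Gini coefficient of non-negative values.

    0.0 means every pod received the same load; (n - 1) / n means
    all load landed on a single pod out of n.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i = 1..n over sorted x
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def imbalance_summary(requests_per_pod):
    """Summarize how evenly requests are spread across pods (hypothetical helper)."""
    counts = list(requests_per_pod.values())
    avg = mean(counts)
    std = pstdev(counts)  # population standard deviation across pods
    return {
        "mean": avg,
        "stddev": std,
        # Coefficient of variation: stddev normalized by the mean, so that
        # clusters with different traffic volumes can be compared.
        "cv": std / avg if avg else 0.0,
        "gini": gini(counts),
    }


if __name__ == "__main__":
    # Hypothetical request counts collected over one time window.
    sample = {"pod-a": 120, "pod-b": 95, "pod-c": 310, "pod-d": 75}
    print(imbalance_summary(sample))
```

The standard deviation depends on traffic volume, while the Gini coefficient is scale-free, so reporting both makes it easier to compare windows with different request rates.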

Recently, we have run into many problems when measuring load balance. Besides the bugs we fixed, we noticed that it is hard to diagnose some of the deeper problems. To address this, I suggest improving the metrics measurement so that we can better evaluate performance.

Use Case

No response

Proposed Solution

No response

Jeffwan, Feb 11 '25 06:02