aibrix
Issue with metric refresh interval
🐛 Describe the bug
I noticed that metrics are not refreshed at the configured interval. In the logs below, the interval is set to 1s, but the metric for the decode-0 pod is updated twice within the same interval (at 23:28:23.046741 and again at 23:28:23.089137).
I1029 23:28:23.046741 1 cache_metrics.go:453] "Updating model metric" pod="vllm-2p2d-tp2-1roleset-roleset-wk7x7-decode-5dcf575575-0" model="qwen3-32b" generation_token_total={"Value":1315505} avg_generation_throughput_toks_per_s={"Value":1217.4834983307785}
I1029 23:28:23.046754 1 cache_metrics.go:186]
I1029 23:28:23.049670 1 gateway_rsp_body.go:157] request end, requestID: fead47f2-39d2-4bf7-ada8-a3e9c67a21d2 - targetPod: 192.168.0.102:8000, elapsed: 6.59526097s
I1029 23:28:23.089137 1 cache_metrics.go:453] "Updating model metric" pod="vllm-2p2d-tp2-1roleset-roleset-wk7x7-decode-5dcf575575-0" model="qwen3-32b" generation_token_total={"Value":1315565} avg_generation_throughput_toks_per_s={"Value":1415.1147987115946}
I1029 23:28:23.089151 1 cache_metrics.go:186]
I1029 23:28:23.089311 1 cache_metrics.go:453] "Updating model metric" pod="vllm-2p2d-tp2-1roleset-roleset-wk7x7-decode-5dcf575575-1" model="qwen3-32b" generation_token_total={"Value":1347888} avg_generation_throughput_toks_per_s={"Value":1370.1406205634362}
Steps to Reproduce
I added a log message in cache_metrics.go to print the values, then ran a benchmark test.
Expected behavior
There should be only one record per pod for each metric within a single refresh interval.
Environment
NA
Could this be a problem with duplicate pods in metaPods? 🤔
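One rough way to check that hypothesis is to scan the pod list for repeated entries just before the metrics update loop runs. The sketch below is not aibrix code: `podRef` and `findDuplicatePods` are hypothetical names, and the assumption is that each entry in metaPods can be reduced to a namespace/name key.

```go
package main

import "fmt"

// podRef is a hypothetical stand-in for whatever entries metaPods holds;
// the real type in aibrix may differ.
type podRef struct {
	Namespace string
	Name      string
}

// findDuplicatePods returns every namespace/name key that occurs more than
// once in pods, in the order the repeats are first seen.
func findDuplicatePods(pods []podRef) []string {
	seen := make(map[string]int, len(pods))
	var dups []string
	for _, p := range pods {
		key := p.Namespace + "/" + p.Name
		seen[key]++
		if seen[key] == 2 { // report each duplicated key only once
			dups = append(dups, key)
		}
	}
	return dups
}

func main() {
	pods := []podRef{
		{Namespace: "default", Name: "decode-0"},
		{Namespace: "default", Name: "decode-1"},
		{Namespace: "default", Name: "decode-0"}, // same pod listed twice
	}
	fmt.Println(findDuplicatePods(pods))
}
```

If a check like this reports any keys during the benchmark, the double update for decode-0 would be explained by the pod being iterated twice rather than by the refresh timer firing early.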