[Feature] metrics support
Objective
Align with the vLLM v1 metrics system and extend beyond it. The design also draws on SGLang's monitoring.
TODO
- [x] Change `time.perf_counter()`
- [ ] Abstract output processing outside of async engine `generate()`
- [ ] Expert information collections
- [ ] Grafana visualization
Usage
Start the server with --enable-metrics
lmdeploy serve api_server models--Qwen--Qwen2.5-7B-Instruct --enable-metrics
- Metrics Publishing - Logging: Information is printed to the terminal every 10 seconds.
- Metrics Publishing - Prometheus & Grafana (WIP): Open http://xxxx:23333/metrics/ to view the Prometheus details.
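To check the endpoint without a browser, the `/metrics/` page can be fetched and parsed from a script. The following is a minimal sketch, assuming the endpoint serves the standard Prometheus text exposition format. The helper names `parse_prometheus_text` and `fetch_metrics` are hypothetical and not part of lmdeploy, and the metric names in the sample are invented for illustration.

```python
from urllib.request import urlopen


def parse_prometheus_text(text):
    """Parse Prometheus text exposition format into {series: value}.

    Comment lines (# HELP / # TYPE) and blank lines are skipped.
    Optional trailing timestamps are ignored.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The series name ends at the closing brace of its label set,
        # or at the first space when there are no labels.
        if "}" in line:
            end = line.rindex("}") + 1
        else:
            end = line.index(" ") if " " in line else len(line)
        series, rest = line[:end], line[end:].split()
        if not rest:
            continue  # malformed line without a value
        samples[series] = float(rest[0])
    return samples


def fetch_metrics(base_url, timeout=5.0):
    """Fetch <base_url>/metrics/ and return the parsed samples.

    base_url is an assumption: the address of a server that was
    started with --enable-metrics.
    """
    url = f"{base_url.rstrip('/')}/metrics/"
    with urlopen(url, timeout=timeout) as resp:
        return parse_prometheus_text(resp.read().decode("utf-8"))


# Invented sample payload, only to show the parser's output shape.
SAMPLE = """\
# HELP demo_requests_total Hypothetical counter.
# TYPE demo_requests_total counter
demo_requests_total 3
demo_latency_seconds_bucket{le="0.5"} 2
"""

if __name__ == "__main__":
    for series, value in parse_prometheus_text(SAMPLE).items():
        print(series, value)
```

Against a running server, `fetch_metrics("http://xxxx:23333")` with `xxxx` replaced by the server host returns the same kind of dictionary.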