
[Metrics] Extend metrics to measure physical performance stats

Open dcrankshaw opened this issue 7 years ago • 4 comments

It would be great to extend our monitoring infrastructure to measure more physical performance statistics. To start with, @blackhat06 suggested tracking the following resource metrics:

  • [ ] Disk IO: % time that device was busy
  • [ ] Memory: % of total memory capacity in use
  • [ ] CPU Utilization: 5, 10, 15 min
  • [ ] Memory Utilization: Breakdown by memory
  • [ ] % volume usage: all mounted disks
  • [ ] Bits in/out (Ethernet)
  • [ ] Volume I/O
  • [ ] Process count / running/blocked
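
As a rough illustration of what a few of these checklist items measure, here is a minimal standard-library sketch for a Linux host. The function names are hypothetical and not part of Clipper; it assumes `/proc/meminfo` is available and that "in use" means `MemTotal - MemAvailable`. Note that the OS reports load averages over 1, 5 and 15 minute windows.

```python
import os
import shutil


def memory_used_percent(meminfo_text):
    """Percent of total memory in use, from /proc/meminfo-style text.

    Assumption: "in use" is MemTotal minus MemAvailable.
    """
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])  # values are reported in kB
    total = fields["MemTotal"]
    available = fields["MemAvailable"]
    return 100.0 * (total - available) / total


def volume_used_percent(path="/"):
    """Percent of the volume containing `path` that is used."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def sample():
    """Collect one snapshot of a few host metrics (Linux only)."""
    with open("/proc/meminfo") as f:
        mem = memory_used_percent(f.read())
    load1, load5, load15 = os.getloadavg()  # 1, 5 and 15 minute load averages
    return {
        "memory_used_percent": mem,
        "root_volume_used_percent": volume_used_percent("/"),
        "load_1m": load1,
        "load_5m": load5,
        "load_15m": load15,
    }
```

Calling `sample()` periodically would give a time series for these values; disk and network I/O counters are left out of this sketch.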

dcrankshaw avatar Feb 28 '18 00:02 dcrankshaw

Prometheus can track these with the node exporter: https://github.com/prometheus/node_exporter/blob/master/README.md

For Kubernetes, we can just scrape kube-api-server/metrics. Kubernetes exposes Prometheus metrics there.
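
To make the node exporter suggestion concrete, here is a minimal sketch of reading its output from Python. It assumes the exporter listens on its default port 9100 on localhost and emits lines without timestamps; `fetch_metrics`, `parse_metrics` and `memory_used_percent` are hypothetical helper names, not Clipper or Prometheus APIs. In a real deployment Prometheus itself would do the scraping.

```python
from urllib.request import urlopen


def fetch_metrics(url="http://localhost:9100/metrics"):
    """Fetch the raw text exposition from a node exporter (URL is an assumption)."""
    with urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def parse_metrics(text):
    """Map each series (name plus any labels) to its float value.

    Assumes lines of the form `<series> <value>` with no trailing timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def memory_used_percent(samples):
    """Percent of total memory in use, from node exporter memory gauges."""
    total = samples["node_memory_MemTotal_bytes"]
    available = samples["node_memory_MemAvailable_bytes"]
    return 100.0 * (1.0 - available / total)
```

With an exporter running, `memory_used_percent(parse_metrics(fetch_metrics()))` would cover the "% of total memory capacity in use" item; `node_load5` and `node_procs_running` can be read from the same dictionary.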

simon-mo avatar Feb 28 '18 01:02 simon-mo

Update:

  • For Docker, we can safely assume the user only has one node, so we can just run a node exporter at startup.
  • Kubernetes's API server does expose metrics, but they cover the API server's own requests and etcd usage. We should use a DaemonSet instead, which deploys a Prometheus node exporter to each node (https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/). In fact, this is the practice Kubernetes recommends.
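
A DaemonSet for this could look roughly like the manifest built below. This is only a sketch under assumptions: the `monitoring` namespace, the `node-exporter` name and labels, and the `prom/node-exporter` image tag are illustrative choices, and `node_exporter_daemonset` is a hypothetical helper, not something Clipper ships. The manifest is built as a Python dict and printed as JSON, which `kubectl apply -f -` accepts.

```python
import json


def node_exporter_daemonset(namespace="monitoring", image="prom/node-exporter"):
    """Build a DaemonSet manifest that runs one node exporter pod per node."""
    labels = {"app": "node-exporter"}  # assumed label, used by the selector
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {
            "name": "node-exporter",
            "namespace": namespace,
            "labels": labels,
        },
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Share the host's network and PID namespaces so the
                    # exporter reports on the node rather than the container.
                    "hostNetwork": True,
                    "hostPID": True,
                    "containers": [
                        {
                            "name": "node-exporter",
                            "image": image,
                            "ports": [{"containerPort": 9100, "name": "metrics"}],
                        }
                    ],
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(node_exporter_daemonset(), indent=2))
```

Prometheus would then need a scrape configuration that discovers these pods on port 9100, which is not shown here.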

simon-mo avatar Mar 10 '18 22:03 simon-mo

Awesome. cc @blackhat06

dcrankshaw avatar Mar 12 '18 19:03 dcrankshaw

@simon-mo Is this handled?

rkooo567 avatar Jun 05 '19 04:06 rkooo567