clipper [Metrics] Extend metrics to measure physical performance stats

It would be great to extend our monitoring infrastructure to measure more physical performance statistics. To start with, @blackhat06 suggested tracking the following resource metrics:

[ ] Disk IO: % time that device was busy
[ ] Memory: % of total memory capacity in use
[ ] CPU Utilization: 5,10,15 min
[ ] Memory Utilization: Breakdown by memory
[ ] % volume usage: all mounted disks
[ ] Bits IN/Out (ethernet)
[ ] Volume I/O
[ ] Process count / running/blocked

Feb 28 '18 00:02 dcrankshaw

Prometheus can track these with the node exporter: https://github.com/prometheus/node_exporter/blob/master/README.md

For Kubernetes we can just scrape kube-api-server/metrics. Kubernetes exposes Prometheus metrics there.
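
A rough sketch of reading a few of the requested stats from a node exporter in Python. It assumes the exporter listens on its default port 9100 and uses metric names from recent node exporter releases (node_memory_MemTotal_bytes, node_memory_MemAvailable_bytes); older releases name these differently, so treat the names as assumptions. The helper names (parse_metrics, memory_used_percent, scrape) are made up for illustration.

```python
import urllib.request


def parse_metrics(text):
    """Parse unlabeled samples from the Prometheus text format into a dict."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, HELP/TYPE comments, and labeled series such as
        # per-CPU or per-device metrics.
        if not line or line.startswith("#") or "{" in line:
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            samples[parts[0]] = float(parts[1])
        except ValueError:
            continue  # ignore anything that is not a numeric sample
    return samples


def memory_used_percent(samples):
    """% of total memory in use, based on MemTotal and MemAvailable."""
    # Assumed metric names; check them against your node exporter version.
    total = samples["node_memory_MemTotal_bytes"]
    available = samples["node_memory_MemAvailable_bytes"]
    return 100.0 * (total - available) / total


def scrape(url="http://localhost:9100/metrics"):
    """Fetch and parse the metrics page (URL is an assumed default)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

With an exporter running locally, memory_used_percent(scrape()) would cover the "% of total memory capacity in use" item. In practice Prometheus would do the scraping itself; this only shows what the raw data looks like.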

Feb 28 '18 01:02 simon-mo

Update:

For Docker, we can safely assume the user has only one node, so we can just run a node exporter at startup.
The Kubernetes API server does expose metrics, but they cover only the API server's own requests and etcd usage. We should use a DaemonSet instead, which deploys a Prometheus node exporter to each node (https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/). In fact, this is the practice Kubernetes recommends.
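
A minimal sketch of what such a DaemonSet manifest could look like, built as a Python dict and dumped to JSON (which kubectl accepts as well as YAML). The object name, labels, and namespace are placeholders, and prom/node-exporter is assumed as the image. A real deployment would probably also mount the host's /proc and /sys into the container, which is omitted here.

```python
import json


def node_exporter_daemonset(namespace="default", image="prom/node-exporter"):
    """Build a DaemonSet manifest that runs one node exporter pod per node."""
    labels = {"app": "node-exporter"}  # placeholder label
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": "node-exporter", "namespace": namespace},
        "spec": {
            # The selector must match the pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Share the host's network and PID namespaces so the
                    # exporter reports node-level stats.
                    "hostNetwork": True,
                    "hostPID": True,
                    "containers": [
                        {
                            "name": "node-exporter",
                            "image": image,
                            # 9100 is the node exporter's default port.
                            "ports": [{"containerPort": 9100, "name": "metrics"}],
                        }
                    ],
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(node_exporter_daemonset(), indent=2))
```

The output could be saved to a file and applied with kubectl apply -f, after which Prometheus would need a scrape config pointing at port 9100 on each node.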

Mar 10 '18 22:03 simon-mo

Awesome. cc @blackhat06

Mar 12 '18 19:03 dcrankshaw

@simon-mo Is this handled?

Jun 05 '19 04:06 rkooo567