
Make `pod_total_*` metrics persistent/survive a Kepler restart

Open williamcaban opened this issue 1 year ago • 8 comments

A clear and concise description of what the problem is.

The `pod_total_*` metrics represent the total accumulated over all samples since Kepler started monitoring a particular pod. When the Kepler pods restart, this value goes back to zero.

Describe the solution you'd like: Metrics for `pod_total_*` should persist across (survive) a restart of the Kepler pod.

Describe alternatives you've considered: From Prometheus' perspective, I can run something like `sum(pod_curr_energy_in_core_millijoule{pod_name='my-pod'})` to get an approximate total as seen by Prometheus. However, Kepler samples every 3 seconds and Prometheus has its own scrape schedule, so Prometheus might not scrape every sample, and the total from the sum query can differ from the actual total.
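To illustrate the gap described above, here is a small simulation sketch. The 30-second scrape interval, the 5-minute window, and the 120 mJ per-sample value are assumptions made up for the example; only the 3-second sampling interval comes from the issue text.

```python
# Assumed numbers for illustration only (the 3 s sampling interval is from the issue).
SAMPLE_INTERVAL_S = 3     # Kepler sampling interval
SCRAPE_INTERVAL_S = 30    # hypothetical Prometheus scrape interval
DURATION_S = 300          # hypothetical observation window


def simulate(per_sample_mj=120):
    """Return (true_total, sum_of_scraped_samples) in millijoules."""
    true_total = 0
    scraped_sum = 0
    for t in range(SAMPLE_INTERVAL_S, DURATION_S + 1, SAMPLE_INTERVAL_S):
        true_total += per_sample_mj        # what a pod_total_* metric would accumulate
        if t % SCRAPE_INTERVAL_S == 0:     # Prometheus only sees the sample exposed at scrape time
            scraped_sum += per_sample_mj
    return true_total, scraped_sum


true_total, scraped_sum = simulate()
print(f"true total: {true_total} mJ, sum of scraped samples: {scraped_sum} mJ")
```

With these assumed intervals, Prometheus sees only one in ten samples, so summing the scraped per-sample values falls well short of the accumulated total.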

williamcaban avatar Oct 17 '22 16:10 williamcaban

I would think this requires at least three changes:

  • A CLI flag to load and persist metrics
  • Functions that allow Kepler to load the last metrics upon start and persist the stats before exit
  • A persistent volume for Kepler

rootfs avatar Oct 17 '22 17:10 rootfs

Maybe we also need a way to clean up those metrics, as they might get really big. Also, it seems the flag is just true/false, which means we would persist metrics for all pods, and that might be inefficient. The design might consider how to focus on individual pods, though that can be a long-term thing.

jichenjc avatar Oct 18 '22 01:10 jichenjc

This is a good topic!

Prometheus and Grafana have logic to enable persistent storage, so we should also make this configurable.

The challenge here is how do we save data to persistent storage? Maybe we need a database?

marceloamaral avatar Oct 18 '22 02:10 marceloamaral

@marceloamaral a plain JSON file on a persistent volume?

rootfs avatar Oct 18 '22 13:10 rootfs

Right, so we could keep it in memory and dump it to a file every X minutes.

marceloamaral avatar Oct 19 '22 04:10 marceloamaral

So something like the following:

  • mount a volume (ephemeral, NFS, etc.) at a desired location when Kepler starts
  • every 3 seconds, update pod_total_x and then write it to the file (that should not be a heavy operation?)
  • during startup, load the file if it exists and initialize pod_total_x from it?
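The steps above could be sketched roughly as follows. This is an illustration in Python, not Kepler code: the function names (`load_totals`, `save_totals`, `add_sample`), the file name `pod_totals.json`, and the JSON layout are all hypothetical, and a temporary directory stands in for the mounted volume.

```python
import json
import os
import tempfile


def load_totals(path):
    """Load persisted totals, or start empty if no file exists yet."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


def save_totals(path, totals):
    """Write totals atomically: write a temp file, then rename it over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(totals, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)  # readers never observe a half-written file


def add_sample(totals, pod_name, millijoules):
    """Accumulate one sample into the running total for a pod."""
    totals[pod_name] = totals.get(pod_name, 0) + millijoules


def demo():
    # The temporary directory stands in for the mounted volume.
    with tempfile.TemporaryDirectory() as state_dir:
        path = os.path.join(state_dir, "pod_totals.json")  # hypothetical file name
        totals = load_totals(path)       # first start: no file, so totals are empty
        add_sample(totals, "my-pod", 120)
        add_sample(totals, "my-pod", 80)
        save_totals(path, totals)        # persist after updating
        restored = load_totals(path)     # simulated restart: totals come back from disk
        add_sample(restored, "my-pod", 50)
        return restored
```

Writing to a temporary file and renaming it keeps the state file consistent if the process is killed mid-write. Whether to write on every sample or only every few minutes is the trade-off discussed in the comments above.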

jichenjc avatar Oct 20 '22 07:10 jichenjc

It might make sense to fetch the last "total" metric of a suitable node (if present) directly from Prometheus when starting Kepler's pod. But as mentioned, the numbers would get high very fast, and honestly, is there even a reason to export "total" metrics?

Feelas avatar Oct 20 '22 12:10 Feelas

In fact, we might not need the persistent volume for that!

Prometheus does a very good job of handling cases where a counter has been reset: https://prometheus.io/docs/prometheus/latest/querying/functions/#rate

So if Kepler restarts (which should be very rare), it shouldn't be a big problem from Prometheus' point of view, unless there is a problem with the cluster and Kepler restarts too often. But that would affect everything.
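As a rough sketch of the idea behind that reset handling: any decrease in a counter's value is treated as a reset, and the post-reset value is counted as new increase. The code below is a simplification for illustration, not Prometheus' actual implementation (it ignores extrapolation to the window boundaries), and the sample values are made up.

```python
def increase_with_resets(samples):
    """Total increase across consecutive samples, treating any drop as a counter reset."""
    total = 0.0
    for prev, curr in zip(samples, samples[1:]):
        if curr >= prev:
            total += curr - prev   # normal monotonic growth
        else:
            total += curr          # counter restarted from zero; count the new value
    return total


# Hypothetical scraped values of a total counter; the restart happens before the 4th sample.
samples = [100, 160, 220, 30, 90]
print(increase_with_resets(samples))
```

Whatever was accumulated between the last scrape before the restart and the restart itself is still lost, but the increase on either side of the reset is preserved.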


honestly is there a reason to export "total" metrics?

The official Prometheus documentation has some guidance on this:

Counters are useful for accumulating the number of events or the amount of something in each event.

Gauges are useful for snapshots of state such as requests in progress, free/total memory, or temperature.

The Prometheus guideline also says:

For base unit Power:
Prefer to export a joule counter, so rate(joules[5m]) gives you power in Watts.

We could export power with a gauge, but exporting energy with a counter has some advantages:

  • a counter will not miss any events if Prometheus fails to scrape a metric; the missed increments are still included in the next scraped value
  • using a gauge will force more approximations in the calculation of energy consumption. For example, if we export power, we need to divide the energy by the Kepler collection interval, so it will be yet another average
  • CPU utilization is measured with a counter, e.g., container_cpu_usage_seconds_total
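The first two points can be illustrated with a small sketch. All numbers here are invented for the example: five 3-second intervals, a short power spike in the third one, and a single missed scrape that happens to coincide with the spike.

```python
INTERVAL_S = 3
power_w = [10, 10, 50, 10, 10]   # hypothetical power per interval, in watts
missed_scrapes = {2}             # hypothetical: the scrape during the spike fails

counter_j = 0                    # cumulative energy counter, in joules
counter_samples = []             # what Prometheus sees when a counter is exported
gauge_samples = []               # what Prometheus sees when a power gauge is exported
for i, p in enumerate(power_w):
    counter_j += p * INTERVAL_S
    if i in missed_scrapes:
        continue                 # this scrape never reaches Prometheus
    counter_samples.append(counter_j)
    gauge_samples.append(p)

duration_s = len(power_w) * INTERVAL_S
true_energy_j = sum(p * INTERVAL_S for p in power_w)

# Counter: the last sample already contains the energy from the missed interval.
energy_from_counter_j = counter_samples[-1]
avg_power_from_counter_w = energy_from_counter_j / duration_s

# Gauge: energy must be estimated from the average of the samples that were seen.
energy_from_gauge_j = sum(gauge_samples) / len(gauge_samples) * duration_s

print(true_energy_j, energy_from_counter_j, energy_from_gauge_j)
```

In this made-up case the counter still yields the full energy and the correct average power over the window, while the gauge-based estimate misses the spike entirely.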

marceloamaral avatar Oct 20 '22 13:10 marceloamaral