prometheus-k8s-operator

Compress rules, jobs for cos-proxy

Open sed-i opened this issue 1 year ago • 1 comments

Issue

Some production deployments have several megabytes of relation data just for alert rules and scrape jobs. Contributing factors to the high volume of reldata:

As a result, every relation-changed event that cos-proxy receives from nrpe results in reading and writing several megabytes of data, which takes several seconds to complete. When many units are at play, model settling takes a long time.

In addition, the relation data limit is 16M, and in common envs we already approach 25% of that.

Solution

  • LZMA-compress rules, jobs before storing in reldata, but only in MetricsEndpointAggregator.
  • Decode in MetricsEndpointConsumer.
  • Use encode/decode methods from grafana.
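
The proposal above could look roughly like the following. This is only a sketch under assumptions: the function names `encode_relation_payload` and `decode_relation_payload` and the sample payload are hypothetical, not the actual methods in the grafana library, and the base64 step is assumed here because relation data values must be strings.

```python
import base64
import json
import lzma
from typing import Any


def encode_relation_payload(obj: Any) -> str:
    """Serialize to JSON, LZMA-compress, then base64-encode into a plain string.

    Hypothetical helper: would be called on the aggregator side before
    writing rules or jobs to relation data.
    """
    raw = json.dumps(obj, sort_keys=True).encode("utf-8")
    return base64.b64encode(lzma.compress(raw)).decode("ascii")


def decode_relation_payload(blob: str) -> Any:
    """Reverse of encode_relation_payload: base64-decode, decompress, parse JSON.

    Hypothetical helper: would be called on the consumer side after
    reading the value from relation data.
    """
    return json.loads(lzma.decompress(base64.b64decode(blob)).decode("utf-8"))


# Illustrative payload only: many near-identical scrape jobs.
sample_jobs = [
    {
        "job_name": f"juju_model_app_{i}",
        "static_configs": [{"targets": [f"10.0.0.{i % 250}:9100"]}],
    }
    for i in range(500)
]
```

Rules and jobs tend to be highly repetitive, so the compressed string should be much smaller than the plaintext JSON despite the roughly 33% overhead that base64 adds. The consumer would also need to know whether a given value is compressed or plaintext, which is presumably why the change implies a LIBAPI bump.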

TODO:

  • Since we're bumping LIBAPI here, could consider addressing some of the TODO/FIXMEs sprinkled in the code.

Context

  • https://github.com/canonical/cos-proxy-operator/issues/56
  • https://github.com/canonical/cos-proxy-operator/pull/139

Testing Instructions

Upgrade Notes

sed-i, Jul 19 '24 22:07

As of writing, the diff between v0 and v1 is: [image]

sed-i, Jul 24 '24 23:07

We should probably not move forward with this. Currently, we only compress dashboards, and for now it's probably better to keep everything in plaintext, so we don't fracture the way we handle relation data across our charms.

lucabello, Jan 16 '25 13:01