prometheus-k8s-operator
This charmed operator automates the operational procedures of running Prometheus, an open-source metrics backend.
Context: The Telco team is trying to build Terraform modules for [Charmed 5G](https://canonical-charmed-5g.readthedocs-hosted.com/en/latest/) using the Juju Terraform provider. The main reason for having Terraform modules is that, unlike Juju bundles, they...
## Issue - The JavaScript actions operator is difficult to maintain. - Vectorization of integration tests currently depends on the GitHub CI language. ## Solution Use [spread](https://github.com/snapcore/spread). - [ ]...
### Bug Description When a certain set of scrape jobs is deployed, our scrape job validation is "fooled" and the scrape jobs are written to disk, causing Prometheus to fail....
### Bug Description Prometheus supports specifying the CA certificate of an HTTPS target. Here I'm specifically referencing the "ca" parameter (not "ca_file"). Therefore we are trying to use the prometheus_scrape...
### Bug Description Prometheus sets blocked status on deploy. The status is cleared after the first update-status. ### To Reproduce `juju deploy cos-lite --channel=edge --trust` ### Environment ``` juju info v0.1...
### Enhancement Proposal We could hack together something analogous to Tempo's charm_tracing: a new `charm_metrics` library, which would talk to a Prometheus instance via remote-write using https://pypi.org/project/opentelemetry-exporter-prometheus-remote-write/ Similar to: - https://github.com/canonical/loki-k8s-operator/pull/306
### Enhancement Proposal
Currently our Prometheus config only has:

```yaml
global:
  evaluation_interval: 1m
  scrape_interval: 1m
  scrape_timeout: 10s
```

In HA deployments this is a problem if you need to distinguish...
If we're going to have constant [log/status message] strings this way, they could be class-level rather than instance level, or just plain ALL-CAPS constants the way the old status was...
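The class-level versus ALL-CAPS suggestion above can be illustrated with a minimal sketch. The class name `ExampleCharm`, the constant names, and the message texts are hypothetical and are not taken from the charm's code.

```python
# Module-level ALL-CAPS constant: one shared string, importable from anywhere.
# (Hypothetical name and text, for illustration only.)
PEBBLE_NOT_READY_MESSAGE = "Waiting for Pebble to be ready"


class ExampleCharm:
    """Hypothetical charm-like class; not the actual Prometheus charm."""

    # Class-level constant: defined once on the class and shared by all
    # instances, rather than being re-assigned in __init__ for each instance.
    CONFIG_ERROR_MESSAGE = "Failed to load Prometheus configuration"

    def config_error_status(self) -> str:
        # Attribute lookup on self falls back to the class attribute.
        return self.CONFIG_ERROR_MESSAGE
```

Either form keeps the strings out of `__init__`, so a test can reference `ExampleCharm.CONFIG_ERROR_MESSAGE` without constructing a charm instance.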
## Issue File operations are attempted before pebble-ready: https://github.com/canonical/prometheus-k8s-operator/blob/d33f51f39d990de3b8dcb9436bf69291a8e8b891/src/charm.py#L380-L382 Seems like a stop-gap fix would be `can_connect` guard + defer. ## Logs ``` unit-prom-0: 16:13:12.848 DEBUG unit.prom/0.juju-log certificates:71: Emitting custom...
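The `can_connect` guard + defer stop-gap mentioned in the issue above might look roughly like the following sketch. To stay runnable without the `ops` library, it uses stand-in classes: `FakeContainer` and `FakeEvent` are hypothetical and model only `can_connect()`, `push()`, and `defer()`, which mirror the method names on `ops.Container` and ops events. The helper `write_file_when_ready` and the file path are also assumptions, not the charm's actual code.

```python
class FakeContainer:
    """Hypothetical stand-in for ops.Container (only the calls used here)."""

    def __init__(self, ready: bool):
        self.ready = ready
        self.files = {}

    def can_connect(self) -> bool:
        # True once Pebble in the workload container is reachable.
        return self.ready

    def push(self, path: str, content: str) -> None:
        if not self.ready:
            raise ConnectionError("Pebble is not ready")
        self.files[path] = content


class FakeEvent:
    """Hypothetical stand-in for an ops event; records whether it was deferred."""

    def __init__(self):
        self.deferred = False

    def defer(self) -> None:
        self.deferred = True


def write_file_when_ready(container, event, path: str, content: str) -> bool:
    """Write a file only if Pebble is reachable; otherwise defer the event."""
    if not container.can_connect():
        # Guard: no file operations before pebble-ready; retry on a later hook.
        event.defer()
        return False
    container.push(path, content)
    return True
```

With a container that is not yet reachable, the helper defers the event and writes nothing; once `can_connect()` returns true, the same call performs the write.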
### Enhancement Proposal We need to support use cases where users want to tier their Prometheus deployments to avoid excessive pressure. By adding `send-remote-write` to Prometheus, and remote writing whatever...