postgres-operator
Using pgo metrics with existing Prometheus and Grafana
I'm using pgo v4.5.0 and have deployed my pg pods with the --metrics flag. If I already have grafana and prometheus running in my cluster how can I see the metrics coming from pg pods? How can I install the pg dashboards that you have in your grafana instance? Do you have a link to those? Thanks!
- Prometheus: https://github.com/CrunchyData/pgmonitor/blob/master/prometheus/crunchy-prometheus.yml.containers
- Grafana: https://github.com/CrunchyData/pgmonitor/tree/master/grafana/containers
- Alertmanager: https://github.com/CrunchyData/pgmonitor/blob/master/prometheus/alert-rules.d/crunchy-alert-rules-pg.yml.containers.example
We should probably better document how to connect this to one's existing Prometheus / Grafana. If you are able to get this working and would like to propose a patch (or a write-up of the workflow you used), I'd be happy to review it.
Further to this, I'm trying to create a PodMonitor for an existing Prometheus installation, and it requires a named port on the collect container, which I cannot find a way to create without editing the running Pod definition (!). We have had it running by doing just that, but it's obviously an anti-pattern. I suspect the non-Ansible kustomize deployer that @jkatz is working on may be the answer to this, but please confirm.
Hi, I've created a working PodMonitor for an existing Prometheus.
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: crunchy-postgres-exporter
  labels:
    release: prometheus-operator
  namespace: cattle-monitoring-system
spec:
  namespaceSelector:
    matchNames:
      - cluster-postgres
  selector:
    matchLabels:
      crunchy_postgres_exporter: "true"
  podTargetLabels:
    - deployment_name
    - role
    - pg_cluster
  podMetricsEndpoints:
    - relabelings:
        - sourceLabels:
            - "__meta_kubernetes_pod_container_port_number"
          action: "drop"
          regex: "5432"
        - sourceLabels:
            - "__meta_kubernetes_pod_container_port_number"
          action: "drop"
          regex: "8009"
        - sourceLabels:
            - "__meta_kubernetes_pod_container_port_number"
          action: "drop"
          regex: "2022"
        - sourceLabels:
            - "__meta_kubernetes_pod_container_port_number"
          action: "drop"
          regex: "10000"
        - sourceLabels:
            - "__meta_kubernetes_namespace"
          action: "replace"
          targetLabel: "kubernetes_namespace"
        - sourceLabels:
            - "__meta_kubernetes_pod_name"
          targetLabel: "pod"
        - sourceLabels:
            - "__meta_kubernetes_pod_ip"
          targetLabel: "ip"
          replacement: "$1"
        - sourceLabels:
            - "dbname"
          targetLabel: "dbname"
          replacement: "$1"
        - sourceLabels:
            - "relname"
          targetLabel: "relname"
          replacement: "$1"
        - sourceLabels:
            - "schemaname"
          targetLabel: "schemaname"
          replacement: "$1"
        - targetLabel: "exp_type"
          replacement: "pg"
@zposloncec when I used your PodMonitor, the label clusters, which is in the form of {{ namespace }}:{{ pg_cluster }}, only comes through as {{ pg_cluster }}. Do you have any suggestion on how to keep the original format? Thank you.
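One way to keep the {{ namespace }}:{{ pg_cluster }} form, borrowed from the ServiceMonitor a bit further down in this thread, is to build pg_cluster from both the namespace and the pod label during relabeling. A minimal sketch (the pg_cluster pod label name assumes the same v4-style labels used in the PodMonitor above):
# joins the two source values with ':' so the resulting label is <namespace>:<pg_cluster>
- sourceLabels: [ __meta_kubernetes_namespace, __meta_kubernetes_pod_label_pg_cluster ]
  targetLabel: pg_cluster
  separator: ':'
  replacement: '$1$2'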
Hi,
I've created a ServiceMonitor based on: https://github.com/CrunchyData/pgmonitor/blob/master/prometheus/crunchy-prometheus.yml.containers
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: postgres-servicemonitor
  namespace: monitoring
  labels:
    app: prometheus
spec:
  endpoints:
    - port: postgres-exporter
      path: /metrics
      honorLabels: true
      interval: 10s
      relabelings:
        - sourceLabels: [ __meta_kubernetes_pod_label_crunchy_postgres_exporter ]
          action: keep
          regex: "true"
        - sourceLabels: [ __meta_kubernetes_pod_container_port_number ]
          action: drop
          regex: "5432"
        - sourceLabels: [ __meta_kubernetes_pod_container_port_number ]
          action: drop
          regex: "10000"
        - sourceLabels: [ __meta_kubernetes_pod_container_port_number ]
          action: drop
          regex: "8009"
        - sourceLabels: [ __meta_kubernetes_pod_container_port_number ]
          action: drop
          regex: "2022"
        - sourceLabels: [ __meta_kubernetes_namespace ]
          action: replace
          targetLabel: kubernetes_namespace
        - sourceLabels: [ __meta_kubernetes_pod_name ]
          targetLabel: pod
        - sourceLabels: [ __meta_kubernetes_namespace, __meta_kubernetes_pod_label_pg_cluster ]
          targetLabel: pg_cluster
          separator: ':'
          replacement: '$1$2'
        - sourceLabels: [ __meta_kubernetes_pod_ip ]
          targetLabel: ip
          replacement: '$1'
        - sourceLabels: [ __meta_kubernetes_pod_label_deployment_name ]
          targetLabel: deployment
          replacement: '$1'
        - sourceLabels: [ __meta_kubernetes_pod_label_role ]
          targetLabel: role
          replacement: '$1'
        - sourceLabels: [ dbname ]
          targetLabel: dbname
          replacement: '$1'
        - sourceLabels: [ relname ]
          targetLabel: relname
          replacement: '$1'
        - sourceLabels: [ schemaname ]
          targetLabel: schemaname
          replacement: '$1'
        - targetLabel: exp_type
          replacement: 'pg'
  namespaceSelector:
    matchNames:
      - monitoring
  selector:
    matchLabels:
      vendor: crunchydata
A little input in case it can help someone (for v5.0.1).
First, thanks @mariusstaicu, I reused your ServiceMonitor.
Second, on my side it was not quite enough: I got the entry in service discovery, but nothing on the targets side. To fix that, I needed to add a Service exposing the exporter port, with a selector aimed at the exporter:
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2021-08-23T04:25:23Z"
  labels:
    postgres-operator.crunchydata.com/crunchy-postgres-exporter: "true"
  name: pgo-exporter
  namespace: pgo
spec:
  clusterIP: None
  selector:
    postgres-operator.crunchydata.com/crunchy-postgres-exporter: "true"
  ports:
    - name: postgres
      port: 9187
      protocol: TCP
      targetPort: 9187
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
Of course, you will need to change the selector.matchLabels of the ServiceMonitor accordingly.
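For reference, a hedged sketch of what the adjusted ServiceMonitor selector could look like against the Service above (the namespace and port name here are assumptions taken from that Service, not from the original comment):
spec:
  endpoints:
    - port: postgres              # matches the port name of the pgo-exporter Service above
      path: /metrics
  namespaceSelector:
    matchNames:
      - pgo                       # namespace where the pgo-exporter Service lives
  selector:
    matchLabels:
      postgres-operator.crunchydata.com/crunchy-postgres-exporter: "true"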
It was the last piece needed to integrate the monitoring into my kube-prometheus stack. Bonus: the JSON ServiceMonitor that you can import in your main.jsonnet of kube-prometheus (based on @mariusstaicu's reply):
{
  "apiVersion": "monitoring.coreos.com/v1",
  "kind": "ServiceMonitor",
  "metadata": {
    "name": "postgres-servicemonitor",
    "namespace": "monitoring",
    "labels": {
      "app": "prometheus"
    }
  },
  "spec": {
    "endpoints": [
      {
        "port": "postgres",
        "path": "/metrics",
        "honorLabels": true,
        "interval": "10s",
        "relabelings": [
          {
            "sourceLabels": ["__meta_kubernetes_pod_label_postgres_operator_crunchydata_com_crunchy_postgres_exporter"],
            "action": "keep",
            "regex": "true"
          },
          {
            "sourceLabels": ["__meta_kubernetes_pod_container_port_number"],
            "action": "drop",
            "regex": "5432"
          },
          {
            "sourceLabels": ["__meta_kubernetes_pod_container_port_number"],
            "action": "drop",
            "regex": "^$"
          },
          {
            "sourceLabels": ["__meta_kubernetes_namespace"],
            "action": "replace",
            "targetLabel": "kubernetes_namespace"
          },
          {
            "sourceLabels": ["__meta_kubernetes_pod_name"],
            "targetLabel": "pod"
          },
          {
            "sourceLabels": ["__meta_kubernetes_namespace", "__meta_kubernetes_pod_label_postgres_operator_crunchydata_com_cluster"],
            "targetLabel": "pg_cluster",
            "separator": ":",
            "replacement": "$1$2"
          },
          {
            "sourceLabels": ["__meta_kubernetes_pod_ip"],
            "targetLabel": "ip",
            "replacement": "$1"
          },
          {
            "sourceLabels": ["__meta_kubernetes_pod_label_postgres_operator_crunchydata_com_instance"],
            "targetLabel": "deployment",
            "replacement": "$1"
          },
          {
            "sourceLabels": ["__meta_kubernetes_pod_label_postgres_operator_crunchydata_com_role"],
            "targetLabel": "role",
            "replacement": "$1"
          },
          {
            "sourceLabels": ["dbname"],
            "targetLabel": "dbname",
            "replacement": "$1"
          },
          {
            "sourceLabels": ["relname"],
            "targetLabel": "relname",
            "replacement": "$1"
          },
          {
            "sourceLabels": ["schemaname"],
            "targetLabel": "schemaname",
            "replacement": "$1"
          }
        ]
      }
    ],
    "namespaceSelector": {
      "matchNames": ["pgo"]
    },
    "selector": {
      "matchLabels": {
        "postgres-operator.crunchydata.com/crunchy-postgres-exporter": "true"
      }
    }
  }
}
Hi there!
I hope it's okay to ask this question here. I have the same "problem" as the person who created the issue in the first place. We have our own monitoring stack (Prometheus, Alertmanager, Grafana) and I would like to use Crunchy PGO together with it. I already checked out the exporter and confirmed that the metrics are available. So, once a cluster is up with the exporter sidecars running, what needs to be done to connect the two, apart from setting up a ServiceMonitor which picks up the exporter? Is this use case supported and just not documented, or will I have to bend things quite a bit in order to get meaningful metrics?
Thanks in advance!
I don't know if a ServiceMonitor is necessary. This is an updated PodMonitor spec that works, based off of @zposloncec's earlier post.
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: crunchy-postgres-exporter
  namespace: postgres-operator
spec:
  namespaceSelector:
    matchNames:
      - postgres-operator
  selector:
    matchLabels:
      postgres-operator.crunchydata.com/crunchy-postgres-exporter: "true"
  podTargetLabels:
    - deployment
    - role
    - pg_cluster
  podMetricsEndpoints:
    - port: exporter
      path: /metrics
      honorLabels: true
      interval: 10s
      relabelings:
        - sourceLabels:
            - "__meta_kubernetes_pod_container_port_number"
          action: "drop"
          regex: "5432"
        - sourceLabels:
            - "__meta_kubernetes_pod_container_port_number"
          action: "drop"
          regex: "8009"
        - sourceLabels:
            - "__meta_kubernetes_pod_container_port_number"
          action: "drop"
          regex: "2022"
        - sourceLabels:
            - "__meta_kubernetes_pod_container_port_number"
          action: "drop"
          regex: "10000"
        - sourceLabels:
            - "__meta_kubernetes_namespace"
          action: "replace"
          targetLabel: "kubernetes_namespace"
        - sourceLabels:
            - "__meta_kubernetes_pod_name"
          targetLabel: "pod"
        - sourceLabels:
            - "__meta_kubernetes_namespace"
            - "__meta_kubernetes_pod_label_postgres_operator_crunchydata_com_cluster"
          targetLabel: "pg_cluster"
          separator: ':'
          replacement: '$1$2'
        - sourceLabels:
            - "__meta_kubernetes_pod_ip"
          targetLabel: "ip"
          replacement: "$1"
        - sourceLabels:
            - "__meta_kubernetes_pod_label_postgres_operator_crunchydata_com_instance"
          targetLabel: "deployment"
          replacement: '$1'
        - sourceLabels:
            - "__meta_kubernetes_pod_label_postgres_operator_crunchydata_com_role"
          targetLabel: "role"
          replacement: '$1'
        - sourceLabels:
            - "dbname"
          targetLabel: "dbname"
          replacement: "$1"
        - sourceLabels:
            - "relname"
          targetLabel: "relname"
          replacement: "$1"
        - sourceLabels:
            - "schemaname"
          targetLabel: "schemaname"
          replacement: "$1"
        - targetLabel: "exp_type"
          replacement: "pg"
This spec assumes that the postgres objects live in the postgres-operator namespace, and it has been updated to match the labels in the latest version of the operator, I believe. I used kustomize/monitoring/prometheus-config.yaml in the https://github.com/CrunchyData/postgres-operator-examples repo and prometheus/containers/crunchy-prometheus.yml.containers in pgmonitor as a basis for the labels. It appears that some labels in the prometheus configs are used as sources for backwards compatibility (e.g. what is deployment_name?). I'm not sure the podTargetLabels here are necessary, as every target label specified in the relabelings appeared in the Grafana metrics browser regardless of whether it was also listed as a podTargetLabel.
I imported the dashboards in kustomize/monitoring/dashboards in the examples repo, but needed to replace references to the "PROMETHEUS" datasource with "Prometheus" to match the default datasource that kube-prometheus-stack sets up.
You may need to store the PodMonitor in the namespace where your prometheus stack is installed (i.e. monitoring instead of postgres-operator at line 5) if you don't have target discovery configured to check all namespaces.
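If it helps, a minimal sketch of that variant (assuming your prometheus stack lives in a monitoring namespace while the PostgresCluster pods run in postgres-operator), with the rest of the spec unchanged from above:
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: crunchy-postgres-exporter
  namespace: monitoring            # assumption: where kube-prometheus-stack is installed
spec:
  namespaceSelector:
    matchNames:
      - postgres-operator          # where the PostgresCluster pods actually run
  # ...selector, podTargetLabels, and podMetricsEndpoints identical to the spec above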
Thanks for the snippet @bdols. It works for me; I had been trying with a ServiceMonitor but kept failing to get the target to show up. Thanks!!
Hi @alrooney,
With the release of PGO v5, we are now re-evaluating various GitHub issues, feature requests, etc. from previous PGO v4 releases. More specifically, in order to properly prioritize the features and fixes needed by our end-users for upcoming PGO v5 releases, we are now attempting to identify which PGO v4 issues, use cases, and enhancement requests are still valid (especially in the context of PGO v5).
Therefore, please let us know if you are still experiencing this issue in PGO v5.
If so, you can either respond to this issue directly to ensure it remains open, or you can close this issue and submit a new one for PGO v5 (this would also be a great opportunity to include any updated details, context, or any other information relevant to your issue). Otherwise, we will be closing this issue in the next 30 days.
If you are still running PGO v4, we recommend that you upgrade to PGO v5 as soon as possible to ensure you have the latest PGO features and functionality.
Any news here? I have found this blog post: https://www.crunchydata.com/blog/monitoring-postgresql-clusters-in-kubernetes
but I cannot access the Ansible playbooks.
I understand that, after having deployed the exporter sidecar containers, one has to:
- create a ServiceMonitor for Prometheus (in the existing Prometheus namespace, I assume)
- add a dashboard template in Grafana
Any idea how to achieve this with PGO v5 and existing installations of Prometheus + Grafana (in different namespaces)?
@dberardo-com That blog post is a little old -- if you're starting fresh now, you probably want to install PGO v5 (5.2.0 is the latest). And for that, we have a tutorial about enabling metrics using our metric stack (I know you want to use your own, that's fine, but let's just start this with documentation):
- docs: https://access.crunchydata.com/documentation/postgres-operator/v5/tutorial/monitoring/
- monitoring yaml (for reference): https://github.com/CrunchyData/postgres-operator-examples/tree/main/kustomize/monitoring
The usual way this would work is that you would start a pg cluster in a namespace and then start this monitoring stack in the same namespace, which does two things for us:
- Prometheus is set up to discover the metrics endpoint in the sidecar;
- and all of the dashboards in the dashboards folder get merged into a ConfigMap, which then gets mounted into the Grafana pod here; this just takes advantage of how Grafana does things (see here).
OK, so how do we use that info to set this up?
(a) Add the exporter sidecar to the pg cluster, as you noted.
(b) Create those dashboards in Grafana, either by adding them as a ConfigMap which then gets mounted into the Grafana pod at the path where Grafana expects dashboards (or wherever you have it set to expect dashboards), or by just adding those JSON dashboards by hand; see the ConfigMap sketch after the next paragraph. (Note: if you load them by hand, I hope your Grafana is backed by something more than an ephemeral filesystem, because if the pod dies, all your hand-loaded dashboards go away and you have to do it again. Which is why I like doing it as a ConfigMap. But I don't know your system, so I'm not sure what's easier.)
(c) Create some service discovery mechanism whereby Prometheus in namespace X can get metrics from the pg cluster in namespace Y (assuming you don't have namespaces locked down).
That part (c) is covered above in the discussion of ServiceMonitor and PodMonitor. (I recommend a PodMonitor so you don't need to create a Service.) I think the answers above should give a good head start on that, but the Prometheus docs here are also pretty helpful.
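For part (b), if your Grafana comes from kube-prometheus-stack with the dashboard sidecar enabled, one hedged way to load the pgmonitor dashboards is a labelled ConfigMap. The grafana_dashboard label, the monitoring namespace, and the pgbackrest.json filename below are assumptions that depend on how your sidecar is configured; this is a sketch, not the documented PGO method:
apiVersion: v1
kind: ConfigMap
metadata:
  name: crunchy-grafana-dashboards
  namespace: monitoring            # assumption: the namespace where Grafana runs
  labels:
    grafana_dashboard: "1"         # assumption: the label your Grafana dashboard sidecar watches for
data:
  pgbackrest.json: |
    {}                             # placeholder: paste a dashboard JSON from kustomize/monitoring/dashboards here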
Hi, could you please advise how you fixed the issue with the dashboards not working? I'm able to get metrics using the Helm installation instead of the whole monitoring installation: I just enabled metrics and the needed image in values.yml, and afterwards created a PodMonitor to scrape the data. The issue I encountered is that I don't have labels like __meta_kubernetes_pod_label_postgres_operator_crunchydata_com_role, so I can't make the dashboards work properly.
Thanks for the advice!
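A hedged pointer, based on the PodMonitor and ServiceMonitor examples earlier in this thread: the __meta_kubernetes_pod_label_* entries are only available at relabeling time, not as labels on the scraped metrics, so the operator's role pod label has to be mapped onto a plain target label explicitly in the PodMonitor, for example:
# copies the operator's role pod label onto a plain "role" label (from the PodMonitor above)
- sourceLabels:
    - "__meta_kubernetes_pod_label_postgres_operator_crunchydata_com_role"
  targetLabel: "role"
  replacement: '$1'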