prometheus-operator
TelegramConfigs not working
What happened?
While setting up notification channels and policies for Grafana, I followed the documentation at https://prometheus.io/docs/alerting/latest/configuration/#telegram_config and https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md for receivers.
However, when I applied it for Slack and Telegram alerts, the Slack alerts were set up fine while the Telegram alerts were not. Both are provisioned through Terraform and the changes are applied, but when I log into the Grafana console, the Telegram notification channel is not shown.
I've set up every attribute exactly as the documentation shows.
Did you expect to see something different?
Yes, I expected the notification channel provisioned on the Grafana console.
How to reproduce it (as minimally and precisely as possible): Apply the Alertmanager configuration below:
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: alertmanager-config
  namespace: monitoring
  labels:
    alertmanagerConfig: alertmanager-config
spec:
  route:
    receiver: "null"
    groupBy: ['instance', 'alertname']
    groupWait: 45s
    groupInterval: 10m
    repeatInterval: 2h
    routes:
    - receiver: "slack-infra"
      continue: false
      matchers:
      - name: group
        value: infra
    - receiver: "slack-apps"
      continue: false
      matchers:
      - name: group
        value: apps
    - receiver: "slack-sniptech"
      continue: false
      matchers:
      - name: group
        value: sniptech
    - receiver: "telegram-sniptech"
      continue: false
      matchers:
      - name: group
        value: sniptech
  receivers:
  - name: 'null'
  - name: 'slack-infra'
    slackConfigs:
    - channel: '#alerts'
      sendResolved: true
      text: "<!channel> \nsummary: {{ .CommonAnnotations.summary }}\ndescription: {{ .CommonAnnotations.description }}"
      apiURL:
        key: 'apiURL'
        name: 'slack-config'
  - name: 'slack-apps'
    slackConfigs:
    - channel: '#alerts-apps'
      sendResolved: true
      text: "<!channel> \nsummary: {{ .CommonAnnotations.summary }}\ndescription: {{ .CommonAnnotations.description }}"
      apiURL:
        key: 'apiURL'
        name: 'slack-apps-config'
  - name: 'telegram-sniptech'
    telegramConfigs:
    - botToken:
        key: 'apiToken'
        name: 'telegram-sniptech-config'
      chatID: -794781609
      message: "summary: {{ .CommonAnnotations.summary }}\ndescription: {{ .CommonAnnotations.description }}"
      sendResolved: true
  - name: 'slack-sniptech'
    slackConfigs:
    - channel: '#sniptech-alerts'
      sendResolved: true
      text: "<!channel> \nsummary: {{ .CommonAnnotations.summary }}\ndescription: {{ .CommonAnnotations.description }}"
      apiURL:
        key: 'apiURL'
        name: 'slack-sniptech-config'
Environment
- Prometheus Operator version:
Name: app-cluster-prometheus-operator
Namespace: monitoring
CreationTimestamp: Mon, 22 Aug 2022 16:57:44 +0200
Labels: app.kubernetes.io/component=operator
app.kubernetes.io/instance=app-cluster-prometheus
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=kube-prometheus
helm.sh/chart=kube-prometheus-6.10.3
Annotations: deployment.kubernetes.io/revision: 1
meta.helm.sh/release-name: app-cluster-prometheus
meta.helm.sh/release-namespace: monitoring
Selector: app.kubernetes.io/component=operator,app.kubernetes.io/instance=app-cluster-prometheus,app.kubernetes.io/name=kube-prometheus
Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app.kubernetes.io/component=operator
app.kubernetes.io/instance=app-cluster-prometheus
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=kube-prometheus
helm.sh/chart=kube-prometheus-6.10.3
Service Account: app-cluster-prometheus-operator
Containers:
prometheus-operator:
Image: docker.io/bitnami/prometheus-operator:0.56.2-debian-10-r0
Port: 8080/TCP
Host Port: 0/TCP
Args:
--kubelet-service=kube-system/app-cluster-prometheus-kubelet
--log-format=logfmt
--log-level=info
--localhost=127.0.0.1
--prometheus-config-reloader=$(PROMETHEUS_CONFIG_RELOADER)
Liveness: http-get http://:http/metrics delay=120s timeout=5s period=10s #success=1 #failure=6
Readiness: http-get http://:http/metrics delay=30s timeout=5s period=10s #success=1 #failure=6
Environment:
PROMETHEUS_CONFIG_RELOADER: <set to the key 'prometheus-config-reloader' of config map 'app-cluster-prometheus-operator'> Optional: false
Mounts:
Progressing True NewReplicaSetAvailable
Available True MinimumReplicasAvailable
OldReplicaSets:
- Kubernetes version information:
Client Version: v1.24.2
Kustomize Version: v4.5.4
Server Version: v1.21.14-gke.700
- Kubernetes cluster kind:
Using the Terraform resource helm_release:
resource "helm_release" "grafana-operator" { name = "grafana-operator" repository = "https://charts.bitnami.com/bitnami" chart = "grafana-operator" namespace = kubernetes_namespace.monitoring.metadata[0].name create_namespace = false version = "2.5.2"
set { name = "grafana.enabled" value = "false" }
}
- Manifests:
insert manifests relevant to the issue
- Prometheus Operator Logs:
insert Prometheus Operator logs relevant to the issue here
Anything else we need to know?:
Hello @gmendesnip, I don't have much experience with this, but you have a `-` in the chatID. Could that be the problem?
Isn't there anything showing in the Alertmanager logs?
Unfortunately, nothing in the logs. The chatID does include the `-` (that is what appears in the URL). I tried without it and it also didn't work.
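For what it's worth, negative chat IDs are normal for Telegram groups and supergroups. One way to sanity-check the token and chat ID outside Alertmanager is to call the Bot API directly; `<BOT_TOKEN>` below is a placeholder for the token stored in the `telegram-sniptech-config` secret:

```sh
# Query the Telegram Bot API directly to confirm the bot token and chat ID are valid.
# <BOT_TOKEN> is a placeholder; use the value stored under the 'apiToken' key.
curl "https://api.telegram.org/bot<BOT_TOKEN>/getChat?chat_id=-794781609"
```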
@gmendesnip Could you please let us know what version of prometheus-operator you are using?
Telegram support was added in 0.57 only
It is on version v0.57.0.
From the issue description I see `Image: docker.io/bitnami/prometheus-operator:0.56.2-debian-10-r0`, which looks different.
Could you please check that the version is right in the prometheus-operator deployment?
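For example, you can read the operator image (and therefore its version) straight from the deployment; the deployment name below is taken from the describe output above:

```sh
# Print the operator container image; the tag encodes the operator version.
kubectl -n monitoring get deployment app-cluster-prometheus-operator \
  -o jsonpath='{.spec.template.spec.containers[0].image}'
```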
I guess the docker image version is set by the kube-prometheus chart version. I've been searching the documentation and can't find which kube-prometheus chart version I must set in order to get prometheus-operator 0.57.
My version is set to 6.10.3. Which version do you suggest so I can have Telegram support?
`resource "helm_release" "bitnami-prometheus" { depends_on = [kubernetes_namespace.monitoring]
name = "app-cluster-prometheus"
repository = "https://charts.bitnami.com/bitnami" chart = "kube-prometheus"
namespace = "monitoring" create_namespace = false version = "6.10.3" `
Also, the Grafana Operator is set to version 2.5.2:
resource "helm_release" "grafana-operator" {
  name             = "grafana-operator"
  repository       = "https://charts.bitnami.com/bitnami"
  chart            = "grafana-operator"
  namespace        = kubernetes_namespace.monitoring.metadata[0].name
  create_namespace = false
  version          = "2.5.2"
}
I am not familiar with the bitnami helm charts, but from the description in this issue it looks like the chart is using version 0.56 of prometheus-operator. If the chart deploys 0.57 or above, it should have Telegram support. Can you ask in that chart's repo how to do that? You would also need to update the Alertmanager CRD to get the feature.
How do I update the Alertmanager CRD?
Sorry, I meant the AlertmanagerConfig CRD.
I see that it is packaged as part of the chart: https://github.com/bitnami/charts/blob/master/bitnami/kube-prometheus/crds/crd-alertmanager-config.yaml. So that already has the telegram fields. And the latest chart uses version 0.58 (https://github.com/bitnami/charts/blob/master/bitnami/kube-prometheus/values.yaml#L67). So IIUC you would need to use a newer version of the chart to get the new prometheus-operator version.
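Worth noting: Helm 3 installs CRDs from a chart's crds/ directory on first install but does not upgrade them, so after bumping the chart the CRD may need to be applied manually. A sketch, using the CRD file linked above (pin to a release tag rather than master in practice):

```sh
# Apply the updated AlertmanagerConfig CRD shipped with the newer chart.
kubectl apply -f https://raw.githubusercontent.com/bitnami/charts/master/bitnami/kube-prometheus/crds/crd-alertmanager-config.yaml
```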
I've updated to 0.58 and I still got the same problem:
Name: instr-cluster-prometheus-operator
Namespace: monitoring
CreationTimestamp: Mon, 08 Aug 2022 14:57:52 +0200
Labels: app=kube-prometheus-stack-operator
app.kubernetes.io/instance=instr-cluster-prometheus
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/part-of=kube-prometheus-stack
app.kubernetes.io/version=39.9.0
chart=kube-prometheus-stack-39.9.0
heritage=Helm
release=instr-cluster-prometheus
Annotations: deployment.kubernetes.io/revision: 3
meta.helm.sh/release-name: instr-cluster-prometheus
meta.helm.sh/release-namespace: monitoring
Selector: app=kube-prometheus-stack-operator,release=instr-cluster-prometheus
Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app=kube-prometheus-stack-operator
app.kubernetes.io/instance=instr-cluster-prometheus
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/part-of=kube-prometheus-stack
app.kubernetes.io/version=39.9.0
chart=kube-prometheus-stack-39.9.0
heritage=Helm
release=instr-cluster-prometheus
Annotations: kubectl.kubernetes.io/restartedAt: 2022-09-01T10:55:21+02:00
Service Account: instr-cluster-prometheus-operator
Containers:
kube-prometheus-stack:
Image: quay.io/prometheus-operator/prometheus-operator:v0.58.0
Port: 10250/TCP
Host Port: 0/TCP
Args:
--kubelet-service=kube-system/instr-cluster-prometheus-kubelet
--localhost=127.0.0.1
--prometheus-config-reloader=quay.io/prometheus-operator/prometheus-config-reloader:v0.58.0
--config-reloader-cpu-request=200m
--config-reloader-cpu-limit=200m
--config-reloader-memory-request=50Mi
--config-reloader-memory-limit=50Mi
--thanos-default-base-image=quay.io/thanos/thanos:v0.27.0
--web.enable-tls=true
--web.cert-file=/cert/cert
--web.key-file=/cert/key
--web.listen-address=:10250
--web.tls-min-version=VersionTLS13
Environment:
Available True MinimumReplicasAvailable
Progressing True NewReplicaSetAvailable
OldReplicaSets:
Normal ScalingReplicaSet 18m deployment-controller Scaled up replica set instr-cluster-prometheus-operator-6f676c566 to 1
Normal ScalingReplicaSet 18m deployment-controller Scaled down replica set instr-cluster-prometheus-operator-7ffdfd79c6 to 0
Normal ScalingReplicaSet 4m34s deployment-controller Scaled up replica set instr-cluster-prometheus-operator-d64655fb8 to 1
Normal ScalingReplicaSet 4m30s deployment-controller Scaled down replica set instr-cluster-prometheus-operator-6f676c566 to 0
Can you check the prometheus-operator logs?
There is not much in the logs; they repeat the same thing. The warning for the telegramConfigs receiver is the same as for the slackConfigs ones, yet Slack is working and Telegram is not:
level=info ts=2022-09-01T09:20:48.79339231Z caller=operator.go:750 component=alertmanageroperator key=monitoring/alertmanager msg="sync alertmanager"
level=info ts=2022-09-01T09:20:48.805614481Z caller=operator.go:881 component=alertmanageroperator alertmanager=alertmanager namespace=monitoring msg="config secret not found, using default Alertmanager configuration" secret=alertmanager-alertmanager
level=warn ts=2022-09-01T09:20:48.80610841Z caller=amcfg.go:1609 component=alertmanageroperator alertmanager=alertmanager namespace=monitoring receiver=monitoring/alertmanager-config/slack-infra msg="'matchers' field is using a deprecated syntax which will be removed in future versions" match="unsupported value type" match_re="unsupported value type"
level=warn ts=2022-09-01T09:20:48.806310376Z caller=amcfg.go:1609 component=alertmanageroperator alertmanager=alertmanager namespace=monitoring receiver=monitoring/alertmanager-config/slack-apps msg="'matchers' field is using a deprecated syntax which will be removed in future versions" match="unsupported value type" match_re="unsupported value type"
level=warn ts=2022-09-01T09:20:48.806491653Z caller=amcfg.go:1609 component=alertmanageroperator alertmanager=alertmanager namespace=monitoring receiver=monitoring/alertmanager-config/slack-sniptech msg="'matchers' field is using a deprecated syntax which will be removed in future versions" match="unsupported value type" match_re="unsupported value type"
level=warn ts=2022-09-01T09:20:48.806625005Z caller=amcfg.go:1609 component=alertmanageroperator alertmanager=alertmanager namespace=monitoring receiver=monitoring/alertmanager-config/telegram-sniptech msg="'matchers' field is using a deprecated syntax which will be removed in future versions" match="unsupported value type" match_re="unsupported value type"
level=warn ts=2022-09-01T09:20:49.040016693Z caller=promcfg.go:813 component=prometheusoperator msg="'targetPort' is deprecated, use 'port' instead." version=v2.36.1
level=info ts=2022-09-01T09:20:49.076917757Z caller=operator.go:1389 component=prometheusoperator key=monitoring/instr-cluster-prometheus-prometheus msg="sync prometheus"
level=info ts=2022-09-01T09:20:49.077171142Z caller=operator.go:1558 component=prometheusoperator key=monitoring/instr-cluster-prometheus-prometheus msg="update prometheus status"
level=warn ts=2022-09-01T09:20:49.436212582Z caller=promcfg.go:813 component=prometheusoperator msg="'targetPort' is deprecated, use 'port' instead." version=v2.36.1
level=info ts=2022-09-01T09:20:49.472512653Z caller=operator.go:1558 component=prometheusoperator key=monitoring/instr-cluster-prometheus-prometheus msg="update prometheus status"
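As an aside, the deprecation warnings above come from matchers that omit an explicit matchType, which the v1alpha1 CRD supports. Note also that the slack-sniptech route in the config above matches group=sniptech with continue: false, so the telegram-sniptech route behind it would never be evaluated. A minimal sketch addressing both points (field names per the AlertmanagerConfig v1alpha1 API):

```yaml
routes:
- receiver: "slack-sniptech"
  continue: true    # let matching alerts fall through to the telegram route below
  matchers:
  - name: group
    value: sniptech
    matchType: "="  # explicit matchType avoids the deprecated-syntax warning
- receiver: "telegram-sniptech"
  continue: false
  matchers:
  - name: group
    value: sniptech
    matchType: "="
```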
OK, the configuration looks good to me. Can you also check the Alertmanager logs?
The only thing appearing in the Alertmanager logs after deploying is:
ts=2022-09-02T07:07:08.101Z caller=dispatch.go:352 level=error component=dispatcher msg="Notify for alerts failed" num_alerts=4 err="monitoring/alertmanager-config/slack-infra/slack[0]: notify retry canceled due to unrecoverable error after 1 attempts: channel \"#alerts\": unexpected status code 404: channel_not_found"
ts=2022-09-02T07:07:14.799Z caller=coordinator.go:113 level=info component=configuration msg="Loading configuration file" file=/etc/alertmanager/config/alertmanager.yaml
ts=2022-09-02T07:07:14.801Z caller=coordinator.go:126 level=info component=configuration msg="Completed loading of configuration file" file=/etc/alertmanager/config/alertmanager.yaml
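Another way to narrow this down is to dump the rendered Alertmanager configuration and check whether the telegram receiver made it in at all. A sketch, assuming the operator writes the generated config to a secret named alertmanager-alertmanager-generated; the key is alertmanager.yaml in older operator versions and a gzipped alertmanager.yaml.gz in newer ones:

```sh
# Dump the rendered Alertmanager config to verify the telegram receiver was generated.
# Secret name and key are assumptions; adjust for your Alertmanager name and operator version.
kubectl -n monitoring get secret alertmanager-alertmanager-generated \
  -o jsonpath='{.data.alertmanager\.yaml\.gz}' | base64 -d | gunzip
```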
This issue has been automatically marked as stale because it has not had any activity in the last 60 days. Thank you for your contributions.
This issue was closed because it has not had any activity in the last 120 days. Please reopen if you feel this is still valid.