
TelegramConfigs not working

gmendesnip opened this issue 3 years ago · 15 comments

What happened? Setting up notification channels and policies for Grafana, I was following the documentation from https://prometheus.io/docs/alerting/latest/configuration/#telegram_config and https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md for receivers.

However, when applying the configuration for Slack and Telegram alerts, the Slack alerts were set up fine while the Telegram alerts were not. Both are provisioned through Terraform and the changes are applied, but when I log into the Grafana console, the Telegram notification channel is not shown.

I've set up every attribute exactly as the documentation shows.

Did you expect to see something different?

Yes, I expected the notification channel provisioned on the Grafana console.

How to reproduce it (as minimally and precisely as possible): apply the Alertmanager configuration below.

apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: alertmanager-config
  namespace: monitoring
  labels:
    alertmanagerConfig: alertmanager-config
spec:
  route:
    receiver: "null"
    groupBy: ['instance', 'alertname']
    groupWait: 45s
    groupInterval: 10m
    repeatInterval: 2h
    routes:
      - receiver: "slack-infra"
        continue: false
        matchers:
          - name: group
            value: infra
      - receiver: "slack-apps"
        continue: false
        matchers:
          - name: group
            value: apps
      - receiver: "slack-sniptech"
        continue: false
        matchers:
          - name: group
            value: sniptech
      - receiver: "telegram-sniptech"
        continue: false
        matchers:
          - name: group
            value: sniptech
  receivers:
    - name: 'null'
    - name: 'slack-infra'
      slackConfigs:
        - channel: '#alerts'
          sendResolved: true
          text: "<!channel> \nsummary: {{ .CommonAnnotations.summary }}\ndescription: {{ .CommonAnnotations.description }}"
          apiURL:
            key: 'apiURL'
            name: 'slack-config'
    - name: 'slack-apps'
      slackConfigs:
        - channel: '#alerts-apps'
          sendResolved: true
          text: "<!channel> \nsummary: {{ .CommonAnnotations.summary }}\ndescription: {{ .CommonAnnotations.description }}"
          apiURL:
            key: 'apiURL'
            name: 'slack-apps-config'
    - name: 'telegram-sniptech'
      telegramConfigs:
        - botToken:
            key: 'apiToken'
            name: 'telegram-sniptech-config'
          chatID: -794781609
          message: "summary: {{ .CommonAnnotations.summary }}\ndescription: {{ .CommonAnnotations.description }}"
          sendResolved: true
    - name: 'slack-sniptech'
      slackConfigs:
        - channel: '#sniptech-alerts'
          sendResolved: true
          text: "<!channel> \nsummary: {{ .CommonAnnotations.summary }}\ndescription: {{ .CommonAnnotations.description }}"
          apiURL:
            key: 'apiURL'
            name: 'slack-sniptech-config'

Environment

  • Prometheus Operator version:

Name:                   app-cluster-prometheus-operator
Namespace:              monitoring
CreationTimestamp:      Mon, 22 Aug 2022 16:57:44 +0200
Labels:                 app.kubernetes.io/component=operator
                        app.kubernetes.io/instance=app-cluster-prometheus
                        app.kubernetes.io/managed-by=Helm
                        app.kubernetes.io/name=kube-prometheus
                        helm.sh/chart=kube-prometheus-6.10.3
Annotations:            deployment.kubernetes.io/revision: 1
                        meta.helm.sh/release-name: app-cluster-prometheus
                        meta.helm.sh/release-namespace: monitoring
Selector:               app.kubernetes.io/component=operator,app.kubernetes.io/instance=app-cluster-prometheus,app.kubernetes.io/name=kube-prometheus
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:           app.kubernetes.io/component=operator
                    app.kubernetes.io/instance=app-cluster-prometheus
                    app.kubernetes.io/managed-by=Helm
                    app.kubernetes.io/name=kube-prometheus
                    helm.sh/chart=kube-prometheus-6.10.3
  Service Account:  app-cluster-prometheus-operator
  Containers:
   prometheus-operator:
    Image:      docker.io/bitnami/prometheus-operator:0.56.2-debian-10-r0
    Port:       8080/TCP
    Host Port:  0/TCP
    Args:
      --kubelet-service=kube-system/app-cluster-prometheus-kubelet
      --log-format=logfmt
      --log-level=info
      --localhost=127.0.0.1
      --prometheus-config-reloader=$(PROMETHEUS_CONFIG_RELOADER)
    Liveness:   http-get http://:http/metrics delay=120s timeout=5s period=10s #success=1 #failure=6
    Readiness:  http-get http://:http/metrics delay=30s timeout=5s period=10s #success=1 #failure=6
    Environment:
      PROMETHEUS_CONFIG_RELOADER:  <set to the key 'prometheus-config-reloader' of config map 'app-cluster-prometheus-operator'>  Optional: false
    Mounts:
  Volumes:
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Progressing    True    NewReplicaSetAvailable
  Available      True    MinimumReplicasAvailable
OldReplicaSets:
NewReplicaSet:   app-cluster-prometheus-operator-5d46c4bb98 (1/1 replicas created)
Events:

  • Kubernetes version information:

Client Version: v1.24.2 Kustomize Version: v4.5.4 Server Version: v1.21.14-gke.700

  • Kubernetes cluster kind:

Using terraform resource helm_release

resource "helm_release" "grafana-operator" { name = "grafana-operator" repository = "https://charts.bitnami.com/bitnami" chart = "grafana-operator" namespace = kubernetes_namespace.monitoring.metadata[0].name create_namespace = false version = "2.5.2"

set { name = "grafana.enabled" value = "false" }

}


Anything else we need to know?:

gmendesnip avatar Aug 30 '22 14:08 gmendesnip

Hello @gmendesnip, I don't have much experience with this, but you have a - in the chatID; could that be the issue?

Isn't there anything showing in the Alertmanager logs?

JoaoBraveCoding avatar Aug 30 '22 17:08 JoaoBraveCoding

Unfortunately, nothing in the logs. The chatID really does start with a - (that is what appears in the URL). I tried without it and it also didn't work.
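For what it's worth, a negative chat ID is expected for Telegram groups and supergroups, so keeping the leading minus sign is correct. A minimal sketch of that heuristic (the helper name is illustrative, not part of any API):

```python
def classify_telegram_chat_id(chat_id: int) -> str:
    """Rough classification: Telegram group/supergroup chats have negative
    IDs, private chats have positive IDs. A heuristic sanity check only,
    not an official API validation."""
    return "group" if chat_id < 0 else "private"

# The chat ID from this report keeps its leading minus sign on purpose.
print(classify_telegram_chat_id(-794781609))  # group
print(classify_telegram_chat_id(123456789))   # private
```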

ghost avatar Aug 31 '22 06:08 ghost

@gmendesnip Could you please let us know which version of prometheus-operator you are using? Telegram support was only added in 0.57.

slashpai avatar Sep 01 '22 06:09 slashpai

It is on version v0.57.0.

ghost avatar Sep 01 '22 06:09 ghost

From the issue description I see Image: docker.io/bitnami/prometheus-operator:0.56.2-debian-10-r0, which looks different. Could you please check the version is right in the prometheus-operator deployment?

slashpai avatar Sep 01 '22 07:09 slashpai

I guess the docker image version is set by the kube-prometheus chart version. I'm searching the documentation and can't find which kube-prometheus chart version I must set in order to get prometheus-operator 0.57.

My version is set to 6.10.3. Which version do you suggest using so I can have Telegram support?

`resource "helm_release" "bitnami-prometheus" { depends_on = [kubernetes_namespace.monitoring]

name = "app-cluster-prometheus"

repository = "https://charts.bitnami.com/bitnami" chart = "kube-prometheus"

namespace = "monitoring" create_namespace = false version = "6.10.3" `

ghost avatar Sep 01 '22 07:09 ghost

Also, the Grafana Operator is set to version 2.5.2:

resource "helm_release" "grafana-operator" {
  name             = "grafana-operator"
  repository       = "https://charts.bitnami.com/bitnami"
  chart            = "grafana-operator"
  namespace        = kubernetes_namespace.monitoring.metadata[0].name
  create_namespace = false
  version          = "2.5.2"
}

ghost avatar Sep 01 '22 07:09 ghost

I am not familiar with the Bitnami helm charts, but from the description in the issue it looks like the chart is using version 0.56 of prometheus-operator. If the chart deploys 0.57 or above, it should have Telegram support. Can you ask in that chart's repo how to do that? You would also need to update the Alertmanager CRD to get the feature.

slashpai avatar Sep 01 '22 08:09 slashpai

How do I update the Alertmanager CRD?

ghost avatar Sep 01 '22 08:09 ghost

Sorry, I meant the AlertmanagerConfig CRD.

I see that it is packaged as part of the chart: https://github.com/bitnami/charts/blob/master/bitnami/kube-prometheus/crds/crd-alertmanager-config.yaml. That file already has the Telegram fields. And the latest chart uses version 0.58 (https://github.com/bitnami/charts/blob/master/bitnami/kube-prometheus/values.yaml#L67). So IIUC you would need to use a newer version of the chart to get the new prometheus-operator version.
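One caveat worth noting here: Helm only installs CRDs from a chart's crds/ directory on first install, and helm upgrade does not update them, so upgrading the chart can leave an old AlertmanagerConfig CRD behind. A hedged sketch of the usual workaround; the raw URL is derived from the file linked above and may differ for your chart version:

```shell
# Apply the newer AlertmanagerConfig CRD manually; 'helm upgrade' skips
# files under crds/. The URL is an assumption based on the file linked
# above; pin it to the chart version you actually deploy.
kubectl apply --server-side -f \
  https://raw.githubusercontent.com/bitnami/charts/master/bitnami/kube-prometheus/crds/crd-alertmanager-config.yaml

# Confirm the CRD now knows about telegramConfigs:
kubectl explain alertmanagerconfig.spec.receivers.telegramConfigs
```

These commands require cluster access, so they are shown as an operational sketch rather than a runnable example.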

slashpai avatar Sep 01 '22 08:09 slashpai

I've updated to 0.58 and I still got the same problem:

Name:                   instr-cluster-prometheus-operator
Namespace:              monitoring
CreationTimestamp:      Mon, 08 Aug 2022 14:57:52 +0200
Labels:                 app=kube-prometheus-stack-operator
                        app.kubernetes.io/instance=instr-cluster-prometheus
                        app.kubernetes.io/managed-by=Helm
                        app.kubernetes.io/part-of=kube-prometheus-stack
                        app.kubernetes.io/version=39.9.0
                        chart=kube-prometheus-stack-39.9.0
                        heritage=Helm
                        release=instr-cluster-prometheus
Annotations:            deployment.kubernetes.io/revision: 3
                        meta.helm.sh/release-name: instr-cluster-prometheus
                        meta.helm.sh/release-namespace: monitoring
Selector:               app=kube-prometheus-stack-operator,release=instr-cluster-prometheus
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:           app=kube-prometheus-stack-operator
                    app.kubernetes.io/instance=instr-cluster-prometheus
                    app.kubernetes.io/managed-by=Helm
                    app.kubernetes.io/part-of=kube-prometheus-stack
                    app.kubernetes.io/version=39.9.0
                    chart=kube-prometheus-stack-39.9.0
                    heritage=Helm
                    release=instr-cluster-prometheus
  Annotations:      kubectl.kubernetes.io/restartedAt: 2022-09-01T10:55:21+02:00
  Service Account:  instr-cluster-prometheus-operator
  Containers:
   kube-prometheus-stack:
    Image:      quay.io/prometheus-operator/prometheus-operator:v0.58.0
    Port:       10250/TCP
    Host Port:  0/TCP
    Args:
      --kubelet-service=kube-system/instr-cluster-prometheus-kubelet
      --localhost=127.0.0.1
      --prometheus-config-reloader=quay.io/prometheus-operator/prometheus-config-reloader:v0.58.0
      --config-reloader-cpu-request=200m
      --config-reloader-cpu-limit=200m
      --config-reloader-memory-request=50Mi
      --config-reloader-memory-limit=50Mi
      --thanos-default-base-image=quay.io/thanos/thanos:v0.27.0
      --web.enable-tls=true
      --web.cert-file=/cert/cert
      --web.key-file=/cert/key
      --web.listen-address=:10250
      --web.tls-min-version=VersionTLS13
    Environment:
    Mounts:
      /cert from tls-secret (ro)
  Volumes:
   tls-secret:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  instr-cluster-prometheus-admission
    Optional:    false
Conditions:
  Type          Status  Reason
  ----          ------  ------
  Available     True    MinimumReplicasAvailable
  Progressing   True    NewReplicaSetAvailable
OldReplicaSets:
NewReplicaSet:   instr-cluster-prometheus-operator-d64655fb8 (1/1 replicas created)
Events:
  Type    Reason             Age    From                   Message
  ----    ------             ----   ----                   -------
  Normal  ScalingReplicaSet  18m    deployment-controller  Scaled up replica set instr-cluster-prometheus-operator-6f676c566 to 1
  Normal  ScalingReplicaSet  18m    deployment-controller  Scaled down replica set instr-cluster-prometheus-operator-7ffdfd79c6 to 0
  Normal  ScalingReplicaSet  4m34s  deployment-controller  Scaled up replica set instr-cluster-prometheus-operator-d64655fb8 to 1
  Normal  ScalingReplicaSet  4m30s  deployment-controller  Scaled down replica set instr-cluster-prometheus-operator-6f676c566 to 0

ghost avatar Sep 01 '22 09:09 ghost

Can you check prometheus-operator logs?

slashpai avatar Sep 01 '22 12:09 slashpai

There is not much in the logs; it repeats the same thing. The matchers warning for telegramConfigs is the same as for the slackConfigs, yet Slack works and Telegram does not:

level=info ts=2022-09-01T09:20:48.79339231Z caller=operator.go:750 component=alertmanageroperator key=monitoring/alertmanager msg="sync alertmanager"
level=info ts=2022-09-01T09:20:48.805614481Z caller=operator.go:881 component=alertmanageroperator alertmanager=alertmanager namespace=monitoring msg="config secret not found, using default Alertmanager configuration" secret=alertmanager-alertmanager
level=warn ts=2022-09-01T09:20:48.80610841Z caller=amcfg.go:1609 component=alertmanageroperator alertmanager=alertmanager namespace=monitoring receiver=monitoring/alertmanager-config/slack-infra msg="'matchers' field is using a deprecated syntax which will be removed in future versions" match="unsupported value type" match_re="unsupported value type"
level=warn ts=2022-09-01T09:20:48.806310376Z caller=amcfg.go:1609 component=alertmanageroperator alertmanager=alertmanager namespace=monitoring receiver=monitoring/alertmanager-config/slack-apps msg="'matchers' field is using a deprecated syntax which will be removed in future versions" match="unsupported value type" match_re="unsupported value type"
level=warn ts=2022-09-01T09:20:48.806491653Z caller=amcfg.go:1609 component=alertmanageroperator alertmanager=alertmanager namespace=monitoring receiver=monitoring/alertmanager-config/slack-sniptech msg="'matchers' field is using a deprecated syntax which will be removed in future versions" match="unsupported value type" match_re="unsupported value type"
level=warn ts=2022-09-01T09:20:48.806625005Z caller=amcfg.go:1609 component=alertmanageroperator alertmanager=alertmanager namespace=monitoring receiver=monitoring/alertmanager-config/telegram-sniptech msg="'matchers' field is using a deprecated syntax which will be removed in future versions" match="unsupported value type" match_re="unsupported value type"
level=warn ts=2022-09-01T09:20:49.040016693Z caller=promcfg.go:813 component=prometheusoperator msg="'targetPort' is deprecated, use 'port' instead." version=v2.36.1
level=info ts=2022-09-01T09:20:49.076917757Z caller=operator.go:1389 component=prometheusoperator key=monitoring/instr-cluster-prometheus-prometheus msg="sync prometheus"
level=info ts=2022-09-01T09:20:49.077171142Z caller=operator.go:1558 component=prometheusoperator key=monitoring/instr-cluster-prometheus-prometheus msg="update prometheus status"
level=warn ts=2022-09-01T09:20:49.436212582Z caller=promcfg.go:813 component=prometheusoperator msg="'targetPort' is deprecated, use 'port' instead." version=v2.36.1
level=info ts=2022-09-01T09:20:49.472512653Z caller=operator.go:1558 component=prometheusoperator key=monitoring/instr-cluster-prometheus-prometheus msg="update prometheus status"
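Incidentally, the 'matchers' deprecation warnings above come from the bare name/value matcher form; recent AlertmanagerConfig CRDs also accept an explicit matchType operator. A sketch of one route rewritten that way, under the assumption that the deployed CRD version already exposes the field:

```yaml
routes:
  - receiver: "telegram-sniptech"
    continue: false
    matchers:
      - name: group
        value: sniptech
        matchType: "="   # explicit equality matcher instead of the deprecated bare form
```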

ghost avatar Sep 01 '22 12:09 ghost

OK, the configuration looks good to me. Can you check the Alertmanager logs as well?

slashpai avatar Sep 01 '22 14:09 slashpai

The only thing appearing in the Alertmanager logs after deploying is:

ts=2022-09-02T07:07:08.101Z caller=dispatch.go:352 level=error component=dispatcher msg="Notify for alerts failed" num_alerts=4 err="monitoring/alertmanager-config/slack-infra/slack[0]: notify retry canceled due to unrecoverable error after 1 attempts: channel \"#alerts\": unexpected status code 404: channel_not_found"
ts=2022-09-02T07:07:14.799Z caller=coordinator.go:113 level=info component=configuration msg="Loading configuration file" file=/etc/alertmanager/config/alertmanager.yaml
ts=2022-09-02T07:07:14.801Z caller=coordinator.go:126 level=info component=configuration msg="Completed loading of configuration file" file=/etc/alertmanager/config/alertmanager.yaml
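One further debugging avenue (a sketch; the secret name and data key vary by operator version, so treat both as assumptions): the operator writes the rendered Alertmanager configuration into a Kubernetes secret, and inspecting it shows directly whether the telegram receiver made it into the final config:

```shell
# Recent operator versions store the generated config gzipped in a
# "<alertmanager-name>-generated" secret; older versions store it plain
# under alertmanager.yaml. Adjust the names to your cluster.
kubectl -n monitoring get secret alertmanager-alertmanager-generated \
  -o jsonpath='{.data.alertmanager\.yaml\.gz}' | base64 -d | gunzip | grep -i -A3 telegram
```

This needs cluster access, so it is an operational sketch rather than a runnable example; if grep prints nothing, the receiver was dropped during rendering.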

ghost avatar Sep 02 '22 07:09 ghost

This issue has been automatically marked as stale because it has not had any activity in the last 60 days. Thank you for your contributions.

github-actions[bot] avatar Nov 02 '22 02:11 github-actions[bot]

This issue was closed because it has not had any activity in the last 120 days. Please reopen if you feel this is still valid.

github-actions[bot] avatar Mar 03 '23 02:03 github-actions[bot]