Alertmanager Fails to Start with Configuration Error for Telegram Integration

Open • alemsas opened this issue 11 months ago • 1 comment

Description

I've set up Prometheus with Alertmanager to send alerts to a Telegram chat. However, Alertmanager fails to start, reporting a configuration error related to the chat_id and text fields in the Telegram configuration.

Environment

Prometheus version: v2.30.3
Alertmanager version: v0.26.0
Docker version: (include your docker version here)
Operating System: (include your OS here)

Configuration

I have the following setup in my docker-compose.yml:

version: '3'
services:
  prometheus:
    image: prom/prometheus:v2.30.3
    container_name: prometheus
    volumes:
      - $PWD/prometheus:/etc/prometheus
      - prometheus_data:/prometheus
      - $PWD/alertmanager:/etc/alertmanager
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
    ports:
      - 9091:9090
    restart: always

  alertmanager:
    image: prom/alertmanager:v0.26.0
    container_name: alertmanager
    volumes:
      - ./alertmanager:/etc/alertmanager
    command:
      - '--config.file=/etc/alertmanager/config.yml'
      - '--storage.path=/alertmanager'
    ports:
      - 9093:9093
    restart: always
  elasticsearch-exporter-1:
      image: quay.io/prometheuscommunity/elasticsearch-exporter:latest
      command:
        - '--es.uri=https://elastic:[email protected]:9202'
        - '--es.all'
        - '--es.ssl-skip-verify'
      ports:
        - "9114:9114"
      restart: always
  elasticsearch-exporter-2:
      image: quay.io/prometheuscommunity/elasticsearch-exporter:latest
      command:
        - '--es.uri=https://elastic:[email protected]:9201'
        - '--es.all'
        - '--es.ssl-skip-verify'
      ports:
        - "9115:9114"
      restart: always
volumes:
  prometheus_data:

And my alertmanager/config.yml is configured as follows:

global:
  resolve_timeout: 1m

route:
  receiver: 'telegram'
  group_by: ['alertname', 'cluster']
  repeat_interval: 1h

receivers:
- name: 'telegram'
  telegram_configs:
  - send_resolved: true
    api_url: 'https://api.telegram.org/bot<TOKEN>/sendMessage'
    chat_id: '-100xxxxxx'
    text: |
      {{ range .Alerts }}
        *Alert:* {{ .Annotations.summary }}{{ if .Labels.severity }} - `{{ .Labels.severity }}`{{ end }}
        *Description:* {{ .Annotations.description }}
        *Details:*
        {{ range .Labels.SortedPairs }}{{ .Name }}: {{ .Value }}
        {{ end }}
      {{ end }}

(Note: <TOKEN> and <CHAT_ID> are placeholders for the actual bot token and chat ID.)

Error

Upon starting Alertmanager, it fails with the following error:

yaml: unmarshal errors:
  line 14: cannot unmarshal !!str `-100169...` into int64
  line 15: field text not found in type config.plain

Here are the full Docker container logs for Alertmanager:

ts=2024-03-12T12:13:34.202Z caller=main.go:246 level=info build_context="(go=go1.20.7, platform=linux/amd64, user=root@df8d7debeef4, date=20230824-11:11:58, tags=netgo)"
ts=2024-03-12T12:13:34.203Z caller=cluster.go:186 level=info component=cluster msg="setting advertise address explicitly" addr=192.168.48.5 port=9094
ts=2024-03-12T12:13:34.205Z caller=cluster.go:683 level=info component=cluster msg="Waiting for gossip to settle..." interval=2s
ts=2024-03-12T12:13:34.262Z caller=coordinator.go:113 level=info component=configuration msg="Loading configuration file" file=/etc/alertmanager/config.yml
ts=2024-03-12T12:13:34.262Z caller=coordinator.go:118 level=error component=configuration msg="Loading configuration file failed" file=/etc/alertmanager/config.yml err="yaml: unmarshal errors:\n  line 14: cannot unmarshal !!str `-100169...` into int64\n  line 15: field text not found in type config.plain"
ts=2024-03-12T12:13:34.262Z caller=cluster.go:692 level=info component=cluster msg="gossip not settled but continuing anyway" polls=0 elapsed=57.276658ms

prometheus/alertmanager.rules

groups:
- name: elasticsearch_alerts
  rules:
  - alert: LowElasticsearchNodes
    expr: elasticsearch_cluster_health_number_of_nodes{job="your_job_name", instance=~"your_instance_regex", cluster="your_cluster_name"} < 3
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "Low number of Elasticsearch nodes ({{ $labels.cluster }})"
      description: "Elasticsearch cluster '{{ $labels.cluster }}' has fewer than 3 nodes for more than 5 minutes."

prometheus/prometheus.yml

global:
  scrape_interval: 15s

rule_files:
  - "/root/Prometheus/prometheus/lertmanager.rules"

scrape_configs:
  - job_name: 'elasticsearch-1'
    metrics_path: '/metrics'
    static_configs:
      - targets: ['172.31.63.2:9114']
  - job_name: 'elasticsearch-2'
    metrics_path: '/metrics'
    static_configs:
      - targets: ['172.31.63.2:9115']

tree .
.
├── alertmanager
│   └── config.yml
├── docker-compose.yml
└── prometheus
    ├── alertmanager.rules
    └── prometheus.yml

It appears there's an issue with parsing the chat_id (which is a negative number for Telegram groups) and recognizing the text field in the Telegram configuration.

Steps to Reproduce

Set up Prometheus and Alertmanager with the above configurations.
Start the services using Docker Compose.
Observe the logs of the Alertmanager container.

Expected Behavior

Alertmanager starts successfully and is able to send alerts to the configured Telegram chat.

Actual Behavior

Alertmanager fails to start due to a configuration parsing error related to the Telegram integration.

alemsas • Mar 12 '24 12:03

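By writing chat_id: '-100xxxxxx' with single quotes, you are explicitly making the value a string. The error shows that the field is parsed as an int64, so the ID has to be written without quotes.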
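For illustration, here is a minimal sketch of a receiver that should get past both unmarshal errors. It assumes the telegram_configs fields as listed in the Alertmanager configuration documentation: api_url holding only the API base, bot_token holding the token, chat_id as an integer, and message (rather than text) holding the template. The second error line, "field text not found", points at that last difference. The chat ID below is hypothetical and <TOKEN> is still a placeholder.

```yaml
receivers:
- name: 'telegram'
  telegram_configs:
  - send_resolved: true
    # Assumption: api_url is the API base only; the token is supplied separately.
    api_url: 'https://api.telegram.org'
    bot_token: '<TOKEN>'
    # Hypothetical ID. Left unquoted so YAML reads it as an integer, not a string.
    chat_id: -1001234567890
    # The template goes under "message"; "text" is not a known field here.
    message: |
      {{ range .Alerts }}
        *Alert:* {{ .Annotations.summary }}{{ if .Labels.severity }} - `{{ .Labels.severity }}`{{ end }}
        *Description:* {{ .Annotations.description }}
        *Details:*
        {{ range .Labels.SortedPairs }}{{ .Name }}: {{ .Value }}
        {{ end }}
      {{ end }}
```

The template body is copied unchanged from the original config. Whether the asterisk and backtick markup renders as intended depends on the configured parse mode, which this sketch leaves at its default.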

TheMeier • Mar 12 '24 20:03