Ruler is unable to send notifications to the alertmanager cluster setup
Hi Team,
I configured the alertmanager cluster as per the docs (with 2 replicas) and pointed the ruler at the alertmanager headless service. However, I am getting EOF errors on the ruler, and the alertmanager debug logs report an invalid msgType.
Ruler Config
alertmanager_url: http://_cluster._tcp.cortex-aggregator-v2-alertmanager-headless/api/prom/alertmanager
enable_alertmanager_discovery: true
alertmanager_refresh_interval: 1m0s
enable_alertmanager_v2: false
notification_queue_capacity: 10000
notification_timeout: 10s
AM config
external_url: /api/prom/alertmanager
cluster:
  listen_address: 0.0.0.0:9094
  advertise_address: ""
  peers: cortex-aggregator-v2-alertmanager-headless.cortex-2.svc.cluster.local:9094
  peer_timeout: 15s
  gossip_interval: 200ms
  push_pull_interval: 1m0s
Ruler Logs
level=error ts=2022-09-13T16:37:20.663333971Z caller=notifier.go:527 user=something-else-for-testing alertmanager=http://cortex-aggregator-v2-alertmanager-0.cortex-aggregator-v2-alertmanager-headless.cortex-2.svc.cluster.local:9094/api/prom/alertmanager/api/v1/alerts count=1 msg="Error sending alert" err="Post \"http://cortex-aggregator-v2-alertmanager-0.cortex-aggregator-v2-alertmanager-headless.cortex-2.svc.cluster.local:9094/api/prom/alertmanager/api/v1/alerts\": EOF"
AM Logs
level=debug ts=2022-09-13T16:33:23.849663134Z caller=cluster.go:329 component=cluster memberlist="2022/09/13 16:33:23 [ERR] memberlist: Received invalid msgType (80) from=10.115.16.92:58804\n"
Expected behavior: Ruler should send notifications to all alertmanager pods.
Environment: Cortex is deployed using the cortex helm chart v0.7.0. The Cortex version is 1.10.0.
Additional Context: I tried making a curl request from one of the ruler pods:
curl -iv -XPOST http://cortex-aggregator-v2-alertmanager-0.cortex-aggregator-v2-alertmanager-headless.cortex-2.svc.cluster.local:9094/api/prom/alertmanager/api/v1/alerts
and got an empty response:
curl: (52) Empty reply from server
The AM logs again showed an invalid msgType (80) error.
Can I get some pointers on how to proceed here?
👋 Hi @roobalimsab, are you in the Cortex Slack channel? Many Cortex users may be able to help you out more effectively there :)
curl: (52) Empty reply from server
is a generic error that curl reports when the server closes the underlying connection without sending a response. One thing I would check is whether a GET request returns anything, for example by running:
curl -iv -XGET http://cortex-aggregator-v2-alertmanager-0.cortex-aggregator-v2-alertmanager-headless.cortex-2.svc.cluster.local:9094/api/prom/alertmanager/api/v1/alerts
This seems more like a networking issue than a Cortex issue :)
Also, would it be possible for you to upgrade to the latest Cortex version, 1.13.1, and see if the problem persists? There were some known memberlist issues before 1.13.0.
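One possible reading of the logs, offered as an assumption rather than a confirmed diagnosis: in the AM config above, port 9094 is the cluster listen_address, and the ruler log shows alerts being posted to that same port. The _cluster._tcp prefix in alertmanager_url is an SRV-style name, so with discovery enabled it would select the service port named cluster (assuming the headless service names its gossip port that way). If memberlist treats the first byte of an incoming TCP stream as the message type, which is my understanding of its protocol, then an HTTP POST would be reported as msgType 80, the ASCII code for the letter P. A minimal Python sketch of that arithmetic, with hypothetical helper names:

```python
def parse_srv_name(host: str) -> tuple[str, str, str]:
    """Split an SRV-style name '_service._proto.target' into its parts.

    Hypothetical helper for illustration; not part of Cortex.
    """
    service, proto, target = host.split(".", 2)
    if not (service.startswith("_") and proto.startswith("_")):
        raise ValueError(f"not an SRV-style name: {host!r}")
    return service[1:], proto[1:], target


def first_byte_of_request(method: str, path: str = "/") -> int:
    """Return the first byte an HTTP/1.1 client writes for this method."""
    request_line = f"{method} {path} HTTP/1.1\r\n".encode("ascii")
    return request_line[0]


if __name__ == "__main__":
    # Host part of alertmanager_url from the ruler config in this issue.
    print(parse_srv_name("_cluster._tcp.cortex-aggregator-v2-alertmanager-headless"))
    print(first_byte_of_request("POST"))  # 80, matches "invalid msgType (80)"
    print(first_byte_of_request("GET"))   # 71
```

If this reading is right, a GET against port 9094 would show up in the AM logs as msgType 71, and the thing to try would be pointing the ruler at the Alertmanager HTTP port instead of the gossip port. The name and number of that HTTP port depend on the helm chart, so check the service definition before changing the URL.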
This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.