opsgenie_config using api_key_file not working
What did you do? I'm using the Vault injector to inject the API key into /vault/secrets/opsgenie_api_key. The file contains the API key, its owner is nobody (the same as the Alertmanager user/group), and its mode is 644 or 777 (tried both).
The same alert can be routed to Slack, but not to Opsgenie.
Using a plain-text api_key value works.
What did you expect to see? I expected api_key_file to work, but so far no luck; some guidance would be appreciated.
What did you see instead? Under which circumstances?
ts=2024-03-13T02:42:59.388Z caller=notify.go:848 level=warn component=dispatcher receiver=opsgenie integration=opsgenie[0] aggrGroup="{}/{severity=~\"^(?:critical|error)$\"}:{}" msg="Notify attempt failed, will retry later" attempts=1 err="Post \"https://api.opsgenie.com/v2/alerts\": net/http: invalid header field value for \"Authorization\""
Environment
- Alertmanager version: 0.26.0 and 0.27.0
- Prometheus version: 2.47.0
- Alertmanager configuration file:
global: {}
receivers:
- name: opsgenie
  opsgenie_configs:
  - api_key_file: /vault/secrets/opsgenie_api_key
    message: "{{ range .Alerts }} \n{{ .Annotations.summary }}\n{{ end }}"
    priority: '{{ if .CommonAnnotations.priority }}{{ .CommonAnnotations.priority }}{{ else }}P3{{ end }}'
    responders:
    - name: devops
      type: team
route:
  group_interval: 5m
  group_wait: 10s
  receiver: alerts-slack
  repeat_interval: 3h
  routes:
  - continue: true
    match_re:
      severity: critical|error
    receiver: opsgenie
- Logs:
ts=2024-03-13T02:42:59.388Z caller=notify.go:848 level=warn component=dispatcher receiver=opsgenie integration=opsgenie[0] aggrGroup="{}/{severity=~\"^(?:critical|error)$\"}:{}" msg="Notify attempt failed, will retry later" attempts=1 err="Post \"https://api.opsgenie.com/v2/alerts\": net/http: invalid header field value for \"Authorization\""
This looks like something wrong with the API key rather than Alertmanager. Have you verified, e.g. in a test pod, that /vault/secrets/opsgenie_api_key really contains the correct key?
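For example, something like this against the running pod (namespace, pod, and container names here are just placeholders):

kubectl exec -n monitoring alertmanager-main-0 -c alertmanager -- \
  cat /vault/secrets/opsgenie_api_key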
Thanks for the reply. Yes, the file has the correct key. The funny thing is that the same approach, with the same key, works for the OpsGenie heartbeat (dead man's switch):
- name: prometheus-deadman-switch
  webhook_configs:
  - url: https://api.opsgenie.com/v2/heartbeats/xxxxxx/ping
    send_resolved: false
    http_config:
      basic_auth:
        username: ':'
        password_file: /vault/secrets/opsgenie_api_key
One is an opsgenie_configs, the other is an http_config. You are using /vault/secrets/opsgenie_api_key as a password in the latter, which indicates to me that it contains a password and not an API key.
@zoezhangmattr any feedback?
No, as I mentioned before, the same API key works when provided via a k8s secret. The password is correct; in this case it is the OpsGenie API key.
Hi @zoezhangmattr! Does the file exist and contain the secret at the time the Alertmanager is started? It sounds like there might be a race condition between the Alertmanager starting and vault-injector writing the file.
I'm also running into this. If I go ahead and put the API key directly as a string into opsgenie_config/api_key of the receiver, it works. When using opsgenie_config/api_key_file and a secret that's correctly mounted, it breaks with the exact same API key, and Alertmanager logs invalid header field value for \"Authorization\".
Can you check this?
Does the file exist and contain the secret at the time the Alertmanager is started? It sounds like there might be a race condition between the Alertmanager starting and vault-injector writing the file.
@grobinson-grafana, perhaps to add: I'm not using Vault to inject the file at hand. I'm deploying using Helm and there are no init containers involved (aside from config-reloader).
So given that the secret is deployed beforehand and I'm not injecting using Vault, I'm assuming the file is present before Alertmanager starts, given standard Kubernetes pod lifecycle management, right?
Let me see if I can figure out how to add a short magic sleep before the Alertmanager process starts. In the meantime, the Alertmanager values for reference:
...
alertmanager:
  enabled: true
  alertmanagerSpec:
    image:
      registry: quay.io
      repository: prometheus/alertmanager
      tag: v0.27.0
      sha: ""
    secrets:
    - opsgenie-api-key
  config:
    global:
      resolve_timeout: 5m
    route:
      group_by: ['namespace']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 12h
      receiver: 'null'
      routes:
      - receiver: 'null'
        matchers:
        - job !~ "fdbmeter.*"
      - receiver: 'opsgenie'
        matchers:
        - job =~ "fdbmeter.*"
    receivers:
    - name: 'null'
    - name: 'opsgenie'
      opsgenie_configs:
      - tags: 'integrities,foundationdb'
        api_key_file: /etc/alertmanager/secrets/opsgenie-api-key/opsgenie
...
Went ahead and modified the statefulset as such:
containers:
- command: [
    "/bin/sh", "-c"
  ]
  args:
  - cat "/etc/alertmanager/secrets/opsgenie-api-key/opsgenie";
    /bin/alertmanager --config.file=/etc/alertmanager/config_out/alertmanager.env.yaml ...;
And it outputs my API key just fine, which makes me doubt there's a race condition at play here. Anything else I can test?
I ended up adding some additional logging to the Opsgenie notifier to print the headers before alerting and lo and behold, there's a newline attached to my API key:
ts=2024-06-24T14:32:51.702Z caller=opsgenie.go:296 level=info integration=opsgenie SETAUTHHEADERTO:="GenieKey redacted-api-key-foo-bar\n"
So I'll have a look at how I'm templating my secret file.
Edit: Also correct me if I'm wrong here, but from looking at the code, I doubt this will ever be a race condition, since the API key is read from the file each time an HTTP request to OpsGenie is built and seemingly isn't persisted in the notifier config struct. See this routine here.
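(Side note: the plain cat in the statefulset test above wouldn't make a trailing newline visible anyway; something like od -c does, e.g. against the same mount path:)

# prints every byte of the mounted secret; a stray newline shows up as a trailing \n
od -c /etc/alertmanager/secrets/opsgenie-api-key/opsgenie | tail -n 2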
I'm encountering the same issue with this configuration. I'm not using secrets in any way; I'm setting the API key as plain text.
alertmanager:
  config:
    global:
      resolve_timeout: 5m
    route:
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 3h
      receiver: opsgenie
      routes:
      - match: {}
        receiver: opsgenie
    receivers:
    - name: opsgenie
      opsgenie_configs:
      - api_key: <plain-api-key>
        responders:
        - name: <team-name>
          type: team
This is the error message I see in the logs:
ts=2024-07-10T09:23:26.444Z caller=notify.go:745 level=warn component=dispatcher receiver=kube-prometheus-stack/alertmanager-config-management/opsgenie integration=opsgenie[0] aggrGroup="{}/{}:{alertname=\"KubeVersionMismatch\", prometheus=\"kube-prometheus-stack/kube-prometheus-stack-prometheus\", severity=\"warning\"}" msg="Notify attempt failed, will retry later" attempts=1 err="Post \"https://api.opsgenie.com/v2/alerts\": net/http: invalid header field value for \"Authorization\""
Any help??
@zoezhangmattr did you manage to resolve this?
We had this same issue today, and the cause was that our api key secret ended in a newline character before it was base64 encoded.
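In case it helps others: the newline usually comes from echo (or an editor) before the value is base64-encoded. Creating the secret without it can be done e.g. like this (the secret and key names match the kube-prometheus-stack example above, otherwise illustrative):

# echo appends a "\n"; printf '%s' does not
printf '%s' "$OPSGENIE_API_KEY" | base64

# --from-literal stores the value exactly as given, without a trailing newline
kubectl create secret generic opsgenie-api-key \
  --from-literal=opsgenie="$OPSGENIE_API_KEY"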