autopush-rs
Create an Alert for APNS outage
We recently had a complete APNs outage. This was reflected in our metrics in a few ways:
One: autoendpoint.notification.bridge.error.sum {platform: apns, reason: connection_unavailable}
(see https://earthangel-b40313e5.influxcloud.net/d/do4mmwcVz/autopush-gcp?orgId=1&viewPanel=57&from=1714756175489&to=1715044974289)
Two: autoendpoint.notification.bridge.error.sum {platform: apns} showed high activity while autoendpoint.notification.bridge.sent.sum {platform: apns} showed low activity.
We should establish alerts around these metrics (as well as a similar set for FCM metrics) to notice outages sooner.
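As a starting point for discussion, here is a rough sketch of the alert condition in Python. It is not how the alert would actually be configured in the dashboard; it only illustrates the two signals described above. The `BridgeWindow` type, the `outage_reasons` function, and both threshold values are hypothetical and would need tuning against real traffic.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BridgeWindow:
    """Summed counters for one bridge over one evaluation window (hypothetical type)."""
    platform: str                # e.g. "apns" or "fcm"
    sent: int                    # autoendpoint.notification.bridge.sent.sum
    errors: int                  # autoendpoint.notification.bridge.error.sum (all reasons)
    connection_unavailable: int  # error.sum where reason=connection_unavailable


# Placeholder thresholds, assumed for illustration only.
MAX_CONNECTION_UNAVAILABLE = 100
MAX_ERROR_RATIO = 0.5


def outage_reasons(
    window: BridgeWindow,
    max_connection_unavailable: int = MAX_CONNECTION_UNAVAILABLE,
    max_error_ratio: float = MAX_ERROR_RATIO,
) -> List[str]:
    """Return the reasons this window looks like an outage (empty if healthy)."""
    reasons = []

    # Signal one: a spike in connection_unavailable errors.
    if window.connection_unavailable > max_connection_unavailable:
        reasons.append("connection_unavailable spike")

    # Signal two: errors high while sends are low, expressed as an error ratio.
    total = window.sent + window.errors
    if total > 0 and window.errors / total > max_error_ratio:
        reasons.append("errors dominate sends")

    return reasons


if __name__ == "__main__":
    healthy = BridgeWindow("apns", sent=10_000, errors=50, connection_unavailable=0)
    outage = BridgeWindow("apns", sent=20, errors=5_000, connection_unavailable=4_800)
    print(healthy.platform, outage_reasons(healthy))
    print(outage.platform, outage_reasons(outage))
```

The same check would apply to FCM by feeding it the {platform: fcm} counters. A window with no traffic at all returns no reasons here; whether total silence should also alert is a separate decision.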
┆Issue is synchronized with this Jira Bug