ol-infrastructure icon indicating copy to clipboard operation
ol-infrastructure copied to clipboard

Alert on S3 403 errors

Open pdpinch opened this issue 2 years ago • 0 comments

User Story

Periodically, we get errors like this:

https://sentry.io/organizations/mit-office-of-digital-learning/issues/3420772109/?project=1413655&query=is%3Aunresolved

My understanding is that these are caused by an AWS S3 key expiring.

When the error is on RC, it can be noisy and slow engineers work deploying code.

When the error in on production, it breaks the site.

Description/Context

We should alert on this error, so I can stop watching for it in sentry and 500 errors sent to sentry.

Acceptance Criteria

  • [ ] high priority alert on production systems
  • [ ] low priority alert on RC/QA and CI systems
  • [ ] We need this across all apps in production. If they must be enumerated, I have seen this error on:

pdpinch avatar Jul 15 '22 12:07 pdpinch