Test that the prod deploy alert will alert if the backend and frontend can't talk to each other
In the recent postmortem, it came up that we weren't sure whether the recently added health check probe would "catch itself" even though it should. This ticket captures the work to prove to folks that it would.
See #6960 and this PR for more context
Acceptance criteria
- A video / series of screenshots that show that if the backend is down / Okta is down / the Feature Flag db call throws, the frontend status page at
https://www.simplereport.gov/app/health/deploy-smoke-testreturns false.- We have verified that the script that spins up a Selenium viewer / sends out a Slack alert works when the script fails, but worth proving that that link in the chain works as well.
- either in this ticket or create a follow up ticket for this: figure out how to page on-call person if this alert fires (first we want to test that it's working and hopefully improve noisiness)
To do the above, you'll probably need to
- Edit / duplicate a version of the prod deploy smoke test action to watch for a lower env
- Break the backend for that lower so that the frontend displays failure
- Screenshot / record that process to verify the alert triggers as expected
We also want to slightly tweak the alert due to it being slightly noisy: it seems to fail once every two dozen runs or so. On failure, might be worth taking a screenshot / logging out extra info for further investigation
Got a false positive run here: https://github.com/CDCgov/prime-simplereport/actions/runs/7588051271/job/20669739705#step:14:26
Backend/Okta/Feature flags.
It might be great to get a json blob on that page with values that tell us which of the services are up/down
Update alert to happen for a lower env so we can turn a backend off in the lowers and see this trigger.
smoke test endpoint intentionally broken and deployed to dev5 for testing
smoke test deploy dev workflow configured to run on this testing branch after a completed deploy dev run
dev5.simplereport.gov/app/health/deploy-smoke-test shows a failure status
Slack alert successfully triggered