Delivery: Deepen canary rollout health check
Proposal
Improve the health check that gates canary rollout from a pass/fail curl to a deeper check. We do not want to continue a rollout if a health check fails. This change does not include implementing an automated rollback if the check fails.
Work is needed to determine whether this should look like a suite of end-to-end tests, or a rich health check endpoint that reports on a checklist of key behaviors. The latter would also be useful for monitoring, load balancing, and other use cases.
In addition to server failure and functional errors, we are also interested in request latency and other attributes of good behavior.
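To make the "checklist of key behaviors" option more concrete, here is a minimal sketch of what the core of a rich health check could look like. It is an illustration only, not a proposed implementation: the function name `run_health_checks`, the report shape, and the `max_latency_s` threshold are all assumptions made for this example. Each named check is timed, and a check counts as failed if it returns a falsy value, raises, or exceeds the latency budget.

```python
import time
from typing import Callable, Dict


def run_health_checks(
    checks: Dict[str, Callable[[], bool]],
    max_latency_s: float = 0.5,  # hypothetical latency budget per check
) -> dict:
    """Run each named check and report pass/fail, latency, and any error."""
    results = {}
    for name, check in checks.items():
        start = time.perf_counter()
        try:
            passed = bool(check())
            error = None
        except Exception as exc:  # a crashing check counts as a failure
            passed = False
            error = str(exc)
        latency_s = time.perf_counter() - start
        results[name] = {
            # A check is healthy only if it passed within the latency budget.
            "ok": passed and latency_s <= max_latency_s,
            "latency_s": round(latency_s, 4),
            "error": error,
        }
    # The rollout gate only needs this single flag; monitoring can use the details.
    healthy = all(r["ok"] for r in results.values())
    return {"healthy": healthy, "checks": results}
```

A rollout gate could call this with whatever checks are chosen (for example, hypothetical `can_reach_api` or `can_render_homepage` callables) and stop the rollout when `healthy` is false, while the per-check details remain available to monitoring or a load balancer.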
Problems this will solve
The existing check is too shallow to be a reliable indicator that the deployment is successful.
Additional Context
User Journeys: #26
Related to user journey https://github.com/GoogleCloudPlatform/emblem/issues/26