Review Disaster Recovery Scenario: Losing the whole cluster
Background
This DR scenario documentation is now out of date. It relies on the obsolete create-cluster Ruby script, which stopped working when we introduced the core
Terraform layer and has not been maintained since we switched to the CP CLI.
Approach
Run through the steps for simulating the total loss of a test cluster, then restore it using the CLI create cluster command.
Once testing is complete, update the runbook.
Finally, this is a good time to remove the old Ruby cluster scripts from the infra repo.
Questions / Assumptions
Has any part of the DR process changed other than the move to the CLI create cluster command?
Definition of done
- [ ] runbook has been updated
- [ ] another team member has reviewed
- [ ] smoke tests are green
- [ ] prepare demo for the team