cloud-platform icon indicating copy to clipboard operation
cloud-platform copied to clipboard

Review Disaster Recovery Scenario: Losing the whole cluster

Open sj-williams opened this issue 6 months ago • 2 comments

Background

This DR scenario documentation is now out of date - it relies on using the obsolete create-cluster ruby script (no longer working since we introduced the core terraform layer and has not been maintained since we switched to CP CLI).

Approach

Run through the steps for simulating total loss of a test cluster and restore using the CLI create cluster command.

Once testing is complete, update the runbook.

Finally, now would be a good time to remove the old ruby cluster scripts from the infra repo.

Questions / Assumptions

Has any part of the DR process changed other than the move to the CLI create cluster?

Definition of done

  • [ ] runbook has been updated
  • [ ] another team member has reviewed
  • [ ] smoke tests are green
  • [ ] prepare demo for the team

Reference

How to write good user stories

sj-williams avatar Aug 09 '24 08:08 sj-williams