Radix Disaster Recovery 2025

Open emirgens opened this issue 7 months ago • 1 comments

Create a scenario based on the Silver confidence level, covering Disaster Scenarios "B+C".

Silver:

  • The Recovery Time Objective (RTO) for Silver tier is normally 24 - 48 hours.
  • The Recovery Point Objective (RPO) for Silver tier is normally 8 - 24 hours.

RTO: The period of time within which systems, applications, or functions must be recovered after an outage.

RPO: Maximum amount of data, as measured by time, that can be lost after a recovery from an incident.
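To make the Silver targets above concrete for a tabletop test, here is a minimal Python sketch that checks one simulated outage against them. The function name `meets_silver`, the example timestamps, and the choice of the upper bounds (48 h RTO, 24 h RPO) as pass/fail limits are assumptions for illustration, not part of the DR documentation.

```python
from datetime import datetime, timedelta

# Upper bounds of the Silver ranges quoted above (assumed pass/fail limits).
SILVER_RTO_MAX = timedelta(hours=48)
SILVER_RPO_MAX = timedelta(hours=24)


def meets_silver(last_backup, outage_start, service_restored):
    """Return (rto_ok, rpo_ok) for one simulated outage."""
    downtime = service_restored - outage_start  # measured against RTO
    data_loss = outage_start - last_backup      # measured against RPO
    return downtime <= SILVER_RTO_MAX, data_loss <= SILVER_RPO_MAX


# Hypothetical timeline: backup 6 h before the outage, service back after 30 h.
outage = datetime(2025, 5, 6, 11, 0)
result = meets_silver(
    last_backup=outage - timedelta(hours=6),
    outage_start=outage,
    service_restored=outage + timedelta(hours=30),
)
print(result)
```

Recording these three timestamps during the exercise would be enough to state afterwards whether the rebuild stayed within the Silver tier.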

Disaster Scenario:

  • A: ~~Disruptive Cyber-attack~~ Not part of this scenario
  • B: Physical Failure
  • C: Data Deletion (or data that does not have replication enabled or supported)

More info: TDI Assurance (see links in this ppt), WR2939, DR Overview documentation, DR Tabletop test template

Story:

The Azure region West Europe has been flooded, and the C2 resources must be rebuilt in the paired region (North Europe).

Rebuild everything in C2 to C3.

What/How:

| Type | Name | Action | Notes |
| --- | --- | --- | --- |
| ResourceGroup | clusters-c3 | Create | Not required |
| Public IP Prefix | ippre-radix-aks-c3-northeurope-001 | Create | Required |
| Public IP Prefix | ippre-ingress-radix-aks-c3-northeurope-001 | Create | Required |
| Public IP address | pip-ingress-radix-aks-c3-northeurope-001 | Create | 001 to 008 |
| Public IP address | pip-radix-aks-c2-northeurope-001 | Create | 001 to 016 |
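The "001 to 008" and "001 to 016" notes imply a numbered series of public IP addresses per name. As a rough sketch, the full name lists could be generated like this so they can be pasted into a checklist or script. The helper `numbered_names` and the variable names are hypothetical, and the name stems are copied from the table exactly as written.

```python
def numbered_names(stem, first, last):
    """Build names like '<stem>-001' for first..last inclusive."""
    return [f"{stem}-{n:03d}" for n in range(first, last + 1)]


# Stems taken verbatim from the table above; ranges from its Notes column.
ingress_ips = numbered_names("pip-ingress-radix-aks-c3-northeurope", 1, 8)
aks_ips = numbered_names("pip-radix-aks-c2-northeurope", 1, 16)

for name in ingress_ips + aks_ips:
    print(name)
```

This only produces the names; creating the resources themselves is left to whatever tooling the rebuild uses.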

https://docs.omnia.equinor.com/governance/disaster-recovery/dr-overview/

emirgens, May 06 '25 11:05

Started working on a mind-map-ish overview of the resources:

https://miro.com/app/board/uXjVI064TDY=/

Based on https://github.com/equinor/radix-private/blob/master/images/Radix-components.drawio.png

Richard87, May 16 '25 13:05