backup-restore-operator
BUG: Rancher pod killed during backup restore when pruning is enabled
Issue
While restoring a backup in Rancher with exactly the same configuration as the previous one, the Rancher pod gets killed and the Rancher UI disappears, as can be seen in the video:
https://github.com/user-attachments/assets/3a6530e0-172d-48b2-963d-62dcde8352a1
Steps to reproduce (long way)
- Deploy Rancher 2.8.5
- Deploy a git repo
- Create a Backup using the Backup app on Rancher (preferably on S3)
- Create a new Rancher 2.8.5
- Deploy default Backup app
- Restore the previously created backup WITH prune enabled (checked and recommended by default)
Short version to reproduce
- Deploy Rancher 2.8.5
- Create the following secret:

```shell
kubectl create secret generic aws-secret \
  --from-literal=accessKey=xxx \
  --from-literal=secretKey=yyy
```
- Deploy the default Backup app with this configuration:

```yaml
bucketName: epinio-ci
credentialSecretName: aws-secret
credentialSecretNamespace: default
enabled: true
folder: mmt-rancher-backup
endpoint: s3.eu-central-1.amazonaws.com
endpointCA:
insecureTLSSkipVerify: false
region: eu-central-1
```
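For reference, the chart values above roughly correspond to a `Backup` custom resource like the following. This is only a sketch, assuming the backup-restore-operator's `resources.cattle.io/v1` CRDs: the resource name is hypothetical, while the S3 details are taken from this report.

```yaml
# Hypothetical one-off Backup CR equivalent to the chart values above.
apiVersion: resources.cattle.io/v1
kind: Backup
metadata:
  name: s3-backup        # assumed name, not from the report
spec:
  resourceSetName: rancher-resource-set   # default resource set shipped with the app
  storageLocation:
    s3:
      credentialSecretName: aws-secret
      credentialSecretNamespace: default
      bucketName: epinio-ci
      folder: mmt-rancher-backup
      region: eu-central-1
      endpoint: s3.eu-central-1.amazonaws.com
      insecureTLSSkipVerify: false
```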
- Go to Restore Backup and ensure the "Prune" checkbox is checked
- Using the previously provided information, target this backup file: 285-bu-07241230-da49091a-ddeb-4196-881d-d032bea9ea6e-2024-07-24T10-36-02Z.tar.gz. It basically contains this gitrepo on the local cluster:
URL: https://github.com/rancher/fleet-examples
Branch: master
Path: simple
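The restore described in the last two steps can be sketched as a `Restore` custom resource, assuming the backup-restore-operator's `resources.cattle.io/v1` CRDs. The resource name is hypothetical; the backup filename, prune flag, and S3 details come from this report.

```yaml
# Hypothetical Restore CR matching what the "Restore Backup" UI creates here.
apiVersion: resources.cattle.io/v1
kind: Restore
metadata:
  name: restore-with-prune   # assumed name, not from the report
spec:
  backupFilename: 285-bu-07241230-da49091a-ddeb-4196-881d-d032bea9ea6e-2024-07-24T10-36-02Z.tar.gz
  prune: true                # the checkbox that triggers the bug
  storageLocation:
    s3:
      credentialSecretName: aws-secret
      credentialSecretNamespace: default
      bucketName: epinio-ci
      folder: mmt-rancher-backup
      region: eu-central-1
      endpoint: s3.eu-central-1.amazonaws.com
```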
Observed Behavior
The Rancher pod gets deleted and, although it tries to recreate, it is never able to do so:
https://github.com/user-attachments/assets/bb3f3900-3245-4851-acf5-0eece0402760
Expected behavior
The pod should be able to recover cleanly. The gitjob is correctly deployed and active.
Additional info
- This affects Rancher 2.7-head, 2.8-head and 2.9-head; however, it seemed not to affect Rancher 2.7.6 (at least when tried for another issue)
- If the restore is done within the same cluster that the backup was taken from, the UI survives
- If the restore is done WITHOUT pruning, the UI survives, although occasionally the gitjob is in a waiting status
Testing environment
- Single-cluster k3s with k8s version v1.27.10+k3s1