
Backup-manager fails to update backup object

Open schnapsidee opened this issue 2 years ago • 0 comments

Bug Report

What version of Kubernetes are you using? 1.25.9 (RKE2)

What version of TiDB Operator are you using? 1.4.5. Also tried with v1.6.0.-alpha-1; same issue.

What did you do?

Deployed a Backup CR to test backup procedures. The br container returns

BR copy finished

but the backup container can't seem to update the object:

Create rclone.conf file.
2023-06-29T11:38:35.278199255Z /tidb-backup-manager backup --namespace=tidb --backupName=backup-1 --tikvVersion=v7.1.0 --mode=snapshot
E0629 11:38:35.371796       9 reflector.go:138] k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1alpha1.Backup: unknown (get backups.pingcap.com)
2023-06-29T11:38:35.459155308Z I0629 11:38:35.458994       9 backup.go:78] start to process backup tidb/backup-1
2023-06-29T11:38:35.459610523Z I0629 11:38:35.459497       9 manager.go:107] snapshot backup tidb/backup-1 was restarted, status is Scheduled
2023-06-29T11:38:35.482868462Z I0629 11:38:35.482567       9 backup_status_updater.go:123] Backup: [tidb/backup-1] updated successfully
2023-06-29T11:38:35.494508432Z E0629 11:38:35.494302       9 backup_status_updater.go:126] Failed to update backup [tidb/backup-1], error: Operation cannot be fulfilled on backups.pingcap.com "backup-1": the object has been modified; please apply your changes to the latest version and try again
2023-06-29T11:38:35.514177557Z E0629 11:38:35.514008       9 backup_status_updater.go:126] Failed to update backup [tidb/backup-1], error: Operation cannot be fulfilled on backups.pingcap.com "backup-1": the object has been modified; please apply your changes to the latest version and try again
2023-06-29T11:38:35.533084459Z E0629 11:38:35.532902       9 backup_status_updater.go:126] Failed to update backup [tidb/backup-1], error: Operation cannot be fulfilled on backups.pingcap.com "backup-1": the object has been modified; please apply your changes to the latest version and try again
2023-06-29T11:38:35.555893484Z E0629 11:38:35.555683       9 backup_status_updater.go:126] Failed to update backup [tidb/backup-1], error: Operation cannot be fulfilled on backups.pingcap.com "backup-1": the object has been modified; please apply your changes to the latest version and try again
2023-06-29T11:38:35.591938030Z E0629 11:38:35.591573       9 backup_status_updater.go:126] Failed to update backup [tidb/backup-1], error: Operation cannot be fulfilled on backups.pingcap.com "backup-1": the object has been modified; please apply your changes to the latest version and try again
2023-06-29T11:38:35.592108035Z Error from server (Conflict): Operation cannot be fulfilled on backups.pingcap.com "backup-1": the object has been modified; please apply your changes to the latest version and try again
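
For context, the `Conflict` error above is the Kubernetes API server's optimistic-concurrency check: an update is rejected when the `resourceVersion` it carries is older than the stored one. The toy Python sketch below is not tidb-operator code; `Store`, `Conflict`, and `update_with_retry` are hypothetical names used only to illustrate why a write from a stale copy fails and why a retry has to re-read the object first. Whether backup-manager re-reads between its attempts is not something the log shows.

```python
class Conflict(Exception):
    """Stands in for the API server's 409 Conflict response."""


class Store:
    """Toy single-object store with a resourceVersion-style counter (hypothetical)."""

    def __init__(self):
        self.obj = {"resourceVersion": 1, "phase": "Scheduled"}

    def get(self):
        # Return a copy so callers cannot mutate the stored object directly.
        return dict(self.obj)

    def update(self, obj):
        # Reject writes based on an outdated copy of the object.
        if obj["resourceVersion"] != self.obj["resourceVersion"]:
            raise Conflict("the object has been modified")
        self.obj = {**obj, "resourceVersion": obj["resourceVersion"] + 1}


def update_with_retry(store, mutate, attempts=5):
    """Re-read the latest object before every attempt, then try to write it."""
    for _ in range(attempts):
        obj = store.get()
        mutate(obj)
        try:
            store.update(obj)
            return True
        except Conflict:
            continue
    return False
```

A retry loop that resubmits the same stale copy would fail on every attempt, which is one pattern that could produce a run of identical errors like the one above.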

This is the backup object after 3 retries:

apiVersion: pingcap.com/v1alpha1
kind: Backup
metadata:
  name: backup-1
  namespace: tidb
spec:
  backoffRetryPolicy:
    maxRetryTimes: 5
    minRetryDuration: 60s
    retryTimeout: 10m
  backupMode: snapshot
  backupType: full
  br:
    cluster: tidb-test
    clusterNamespace: tidb
    concurrency: 1
    rateLimit: 1
  cleanPolicy: OnFailure
  resources: {}
  s3:
    bucket: [REDACTED]
    endpoint: [REDACTED]
    provider: minio
    region: us-east-1
    secretName: minio-access
  serviceAccount: default
  storageSize: 10Gi
  toolImage: pingcap/br
status:
  backoffRetryStatus:
    - detectFailedAt: '2023-06-29T11:34:03Z'
      expectedRetryAt: '2023-06-29T11:35:03Z'
      realRetryAt: '2023-06-29T11:35:33Z'
      retryNum: 1
      retryReason: Pod backupt-backup-1-5b9tm has failed
    - detectFailedAt: '2023-06-29T11:36:03Z'
      expectedRetryAt: '2023-06-29T11:38:03Z'
      realRetryAt: '2023-06-29T11:38:33Z'
      retryNum: 2
      retryReason: Pod backup-backup-1-r8kns has failed
    - detectFailedAt: '2023-06-29T11:39:03Z'
      expectedRetryAt: '2023-06-29T11:43:03Z'
      retryNum: 3
      retryReason: Pod backup-backup-1-zmxbm has failed
  backupPath: s3://bimplan-surreal
  conditions:
    - lastTransitionTime: '2023-06-29T11:38:33Z'
      status: 'True'
      type: Scheduled
    - lastTransitionTime: '2023-06-29T11:33:50Z'
      status: 'True'
      type: Prepare
    - lastTransitionTime: '2023-06-29T11:38:33Z'
      message: 'reason Pod backup-backup-1-r8kns has failed, original reason '
      reason: RetryFailedBackup
      status: 'True'
      type: RetryFailed
    - lastTransitionTime: '2023-06-29T11:38:35Z'
      status: 'True'
      type: Restart
  phase: Scheduled
  timeCompleted: null
  timeStarted: null
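
The `expectedRetryAt` values above look consistent with a delay that starts at `minRetryDuration` (60s) and doubles on each retry: 60s, 120s, 240s. The short Python sketch below checks that arithmetic against the three entries; the doubling rule is an assumption inferred from these timestamps, not taken from the operator's source, and `expected_retry_at` is a hypothetical helper.

```python
from datetime import datetime, timedelta

# spec.backoffRetryPolicy.minRetryDuration from the Backup object above.
MIN_RETRY = timedelta(seconds=60)
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def expected_retry_at(detect_failed_at: str, retry_num: int) -> str:
    """Assumed rule: delay = minRetryDuration * 2**(retry_num - 1)."""
    detected = datetime.strptime(detect_failed_at, TIME_FORMAT)
    delay = MIN_RETRY * 2 ** (retry_num - 1)
    return (detected + delay).strftime(TIME_FORMAT)


# (detectFailedAt, retryNum) pairs copied from status.backoffRetryStatus.
for detected, num in [
    ("2023-06-29T11:34:03Z", 1),
    ("2023-06-29T11:36:03Z", 2),
    ("2023-06-29T11:39:03Z", 3),
]:
    print(num, expected_retry_at(detected, num))
```

Under that assumption, the computed times match the three `expectedRetryAt` values in the status.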

Tried it several times, happens at every attempt.

I checked previous issues and found #4658, but that seems to have been closed without further information. This is a test deployment and the database doesn't contain much data, so the backup finishes very quickly. I'm not sure whether that's relevant, but I wanted to give as much information as possible.

schnapsidee, Jun 29 '23 11:06