BackupRepository - error to ensure repository storage empty - found existing data in storage location
What steps did you take and what happened: I ran a successful backup. I uninstalled Velero, reinstalled it, ran a new backup, and got the following error. Velero creates the BackupRepository with this error: error to init backup repo: error to create repo with storage: error to ensure repository storage empty: found existing data in storage location
What did you expect to happen: I would expect the backup to succeed again, regardless of whether the bucket is empty. I don't want to have to delete the data from the bucket and lose the previous backups.
Logs: velero debug --backup teste4 bundle-2024-02-21-14-18-43.tar.gz
Anything else you would like to add:
kubectl -n velero describe backuprepository ccp-default-kopia-4zkr5 :
Name:         ccp-default-kopia-4zkr5
Namespace:    velero
Labels:       velero.io/repository-type=kopia
              velero.io/storage-location=default
              velero.io/volume-namespace=ccp
Annotations:  <none>
API Version:  velero.io/v1
Kind:         BackupRepository
Metadata:
  Generate Name:  ccp-default-kopia-
Spec:
  Backup Storage Location:  default
  Maintenance Frequency:    1h0m0s
  Repository Type:          kopia
  Restic Identifier:        s3:s3-eu-central-1.amazonaws.com/eks-backup-308173258961/restic/ccp
  Volume Namespace:         ccp
Status:
  Message:  error to create backup repo: error to create repo with storage: error to ensure repository storage empty: found existing data in storage location
  Phase:    NotReady
Events:     <none>
kubectl -n velero get backup teste4 -o yaml :
apiVersion: velero.io/v1
kind: Backup
metadata:
  annotations:
    meta.helm.sh/release-name: velero
    meta.helm.sh/release-namespace: velero
    velero.io/resource-timeout: 10m0s
    velero.io/source-cluster-k8s-gitversion: v1.27.9-eks-5e0fdde
    velero.io/source-cluster-k8s-major-version: "1"
    velero.io/source-cluster-k8s-minor-version: 27+
  creationTimestamp: "2024-02-21T13:35:50Z"
  generation: 73
  labels:
    app.kubernetes.io/instance: velero
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: velero
    helm.sh/chart: velero-5.3.0
    velero.io/schedule-name: velero-eks-backup
    velero.io/storage-location: default
  name: teste4
  namespace: velero
  resourceVersion: "74760882"
  uid: fc0db8df-e50f-4a4f-b3bb-8c46756e66e5
spec:
  csiSnapshotTimeout: 10m0s
  defaultVolumesToFsBackup: false
  hooks: {}
  itemOperationTimeout: 4h0m0s
  metadata: {}
  resourcePolicy:
    kind: configmap
    name: velero-efs-resourcepolicy
  snapshotMoveData: true
  storageLocation: default
  ttl: 240h0m0s
  volumeSnapshotLocations:
  - default
Environment:
- Velero version (use velero version): v1.13.0
- Velero features (use velero client config get features): features: EnableCSI
- Kubernetes version (use kubectl version): v1.27.9-eks-5e0fdde
- Kubernetes installer & version: v1.24.1
- Cloud provider or hardware configuration: AWS
- OS (e.g. from /etc/os-release):
In the normal case, Velero shouldn't fail with an existing backup repository, because Velero first tries to connect to the repository and only creates a new one if the connection fails.
Could you please check whether your repository is still intact?
There should be metadata files in the backup repository.
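A minimal way to check for that metadata, assuming the bucket and prefix shown in the Restic Identifier above (the reporter runs a similar listing further down): the kopia.repository object is the repository metadata, so an empty result means the metadata is gone.
# bucket/prefix taken from this report; adjust for other environments
aws s3 ls --recursive s3://eks-backup-308173258961/kopia/ccp/ | grep kopia.repository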
The backup repository for the namespace "ccp" was not created in the bucket, but the folder still exists in the bucket.
k get backuprepository -n velero
NAME AGE REPOSITORY TYPE
ccp-default-kopia-b9hsl 20h kopia
monitoring-default-kopia-qxsns 20h kopia
k describe backuprepository -n velero ccp-default-kopia-b9hsl | grep Message
Message: error to create backup repo: error to create repo with storage: error to ensure repository storage empty: found existing data in storage location
aws s3 ls --recursive s3://eks-backup-308173258961 | grep repository
2024-02-20 15:12:27 1075 kopia/monitoring/kopia.repository
aws s3 ls --recursive s3://eks-backup-308173258961 | grep ccp
2024-02-07 12:15:42 4953 kopia/ccp/_log_20240207121540_96b9_1707308140_1707308141_1_61dd9d822a48b9347151554792cdec54
2024-02-07 12:15:44 4989 kopia/ccp/_log_20240207121542_ac05_1707308142_1707308143_1_a00a29adf4b089aaa324d246602805e8
2024-02-07 12:15:46 5028 kopia/ccp/_log_20240207121544_e7d9_1707308144_1707308145_1_7fe9885b6e1223937461f2ff0cb1856a
2024-02-07 12:15:45 4298 kopia/ccp/q62cc98a40dcf13ccd32b2d19475e5ddf-s79d755f1bbf7a5df125
The Kopia repository file is gone. I think that is the reason for the failure. Similar issues have been reported to Velero for a while, but I don't think Velero is making the mess.
Please check whether there is any lifecycle-related policy created for this bucket. The repository file is never updated after the repository is initialized, so it could be the first object deleted when an expiration policy applies. https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lifecycle-mgmt.html
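As a suggestion not part of the original exchange, the bucket's lifecycle configuration can be inspected with the AWS CLI; the call returns a NoSuchLifecycleConfiguration error when no policy is set.
# bucket name taken from this report
aws s3api get-bucket-lifecycle-configuration --bucket eks-backup-308173258961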
Hello, this bucket has no lifecycle policy associated with it.
OK. Is it possible to find a trace of the object deletion in any AWS service?
Unfortunately, no. I also tried the workaround from #6909, but it didn't work.
@filipe-silva-magalhaes-alb I think the only possible option here is to delete the stale backup repository data in the object storage and delete the failed BackupRepository in the cluster.
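A minimal sketch of that cleanup, assuming the bucket prefix and the failed BackupRepository name shown earlier in this thread; note that the first command permanently removes the remaining kopia data for the ccp namespace, so double-check the prefix first.
# prefix and BackupRepository name taken from the outputs above
aws s3 rm --recursive s3://eks-backup-308173258961/kopia/ccp/
kubectl -n velero delete backuprepository ccp-default-kopia-b9hsl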
@filipe-silva-magalhaes-alb I haven't reproduced this issue in my EKS environment yet. Is this issue easy to reproduce in your environment?
I ran a successful backup. I uninstalled Velero, reinstalled it, ran a new backup, and got the following error
If this issue can be reproduced by running these commands sequentially, it's worthwhile to find the reason.
If you can reproduce this, would you mind helping us debug it with some AWS S3 logging turned on? https://repost.aws/knowledge-center/s3-audit-deleted-missing-objects A CloudTrail trail watching the specific object's deletion should be the most convenient way.
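One hedged way to set that up, not taken from the original thread: add an S3 data-event selector for the bucket prefix to an existing CloudTrail trail so that DeleteObject calls against the kopia objects are recorded. The trail name my-trail is a placeholder.
# my-trail is a placeholder; use an existing trail in the account
aws cloudtrail put-event-selectors \
  --trail-name my-trail \
  --event-selectors '[{"ReadWriteType":"WriteOnly","IncludeManagementEvents":true,"DataResources":[{"Type":"AWS::S3::Object","Values":["arn:aws:s3:::eks-backup-308173258961/kopia/"]}]}]'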
I upgraded to Velero 1.13 and then started getting the errors below. The first backup against an empty S3 bucket succeeded, then the second backup failed. I am running Velero with readOnlyRootFilesystem: true using the default image.
Any help is appreciated.
Errors:
  Velero:
    name: /mongodb-0 message: /Error backing up item error: /failed to wait BackupRepository: backup repository is not ready: error to connect to backup repo: error to connect repo with storage: error to connect to repository: unable to write config file: unable to create config directory: mkdir /home/cnb/udmrepo: read-only file system
    name: /mongodb-1 message: /Error backing up item error: /failed to wait BackupRepository: backup repository is not ready: error to connect to backup repo: error to connect repo with storage: error to connect to repository: unable to write config file: unable to create config directory: mkdir /home/cnb/udmrepo: read-only file system
    name: /mongodb-2 message: /Error backing up item error: /failed to wait BackupRepository: backup repository is not ready: error to connect to backup repo: error to connect repo with storage: error to connect to repository: unable to write config file: unable to create config directory: mkdir /home/cnb/udmrepo: read-only file system
  Cluster: <none>
I was able to get the backup to run successfully by adding this section to the chart values, with Velero running on a read-only root filesystem.
extraVolumes:
  - emptyDir: {}
    name: udmrepo
  - emptyDir: {}
    name: cache
extraVolumeMounts:
  - mountPath: /home/cnb/udmrepo
    name: udmrepo
  - mountPath: /home/cnb/.cache
    name: cache
@kingnarmer Sorry, I missed this thread notification.
I suppose you are using Kopia as the uploader and running the container as the user cnb?
@blackpiglet I use Kopia as the uploader. I didn't make any change to whatever default user is used in the container.
@kingnarmer Thanks. This topic may be worth noting somewhere in the Velero documentation.