failed to complete restore [..] no snapshot found
I am trying to validate my Stash backups by performing a restore to a different namespace. However, I am encountering the "no snapshot found" error during restore.
First, I performed a backup to Backblaze B2, and the snapshot appears to have been created successfully: the BackupSession succeeded; /stash/gitea/snapshots/<long filename> exists in the bucket; and /bin/restic check in the backup container logs reports that no errors were found. Also, SNAPSHOT-COUNT is 1:
$ kubectl get repository -n stash gitea-to-b2
NAME          INTEGRITY   SIZE         SNAPSHOT-COUNT   LAST-SUCCESSFUL-BACKUP   AGE
gitea-to-b2   true        11.254 MiB   1                101m                     106m
So, I created a new namespace gitea-restore, and in it created the PVC, an Ingress with a different name, a RestoreSession following https://stash.run/docs/v2022.07.09/guides/use-cases/cross-cluster-backup/, and the StatefulSet itself. The RestoreSession points at the same Repository as the BackupConfiguration; its YAML is below. Stash did create an init container in the StatefulSet, whose logs contain:
I0926 10:38:05.508833 1 restore.go:116] Got leadership, preparing for restore
I0926 10:38:05.572293 1 commands.go:233] Restoring backed up data
[golang-sh]$ /bin/restic restore latest --path /data --host host-0 --target / --cache-dir /tmp/restic-cache
latest snapshot for criteria not found: no snapshot found Paths:[/data] Hosts:[host-0]
I0926 10:38:10.950152 1 status.go:192] Updating hosts status for restore target StatefulSet gitea-restore/gitea.
F0926 10:38:11.092775 1 restore.go:123] failed to complete restore. Reason: latest snapshot for criteria not found: no snapshot found Paths:[/data] Hosts:[host-0]
[......goroutine stacks....]
I0926 10:38:11.092905 1 restore.go:129] Lost leadership
I0926 10:38:11.096042 1 restore.go:98] Restore completed successfully for RestoreSession gitea-restore/restore
I0926 10:38:11.096111 1 main.go:45] Exiting Stash Main
Two things jump out at me. First: the init container failed to restore, but it still logged "restore completed successfully" and exited 0. So my Gitea container is now running without data, whereas I had expected the init container to fail and further startup of the application to be blocked.
State:          Terminated
  Reason:       Completed
  Exit Code:    0
  Started:      Mon, 26 Sep 2022 12:38:04 +0200
  Finished:     Mon, 26 Sep 2022 12:38:11 +0200
But, worse still, the restore itself failed. With the RestoreSession pointing at the same Repository as the BackupSession (gitea-to-b2 in the namespace stash), I would have expected it to find the exact same backups in the same directory in the same bucket, but it finds no snapshots. I tried to look at the contents of the snapshot file in the bucket, but it is binary gibberish, so I can't tell what's in it. Could you help me figure out what's going on?
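If it helps, I believe the repository could also be inspected with a local restic client. Here is a sketch; the placeholders for the B2 key pair, bucket, and repository password are hypothetical and would come from the backend Secret (stash/b2-secret):

```shell
# Hypothetical placeholders — substitute the values from the b2-secret
# backend Secret that the Repository references.
export B2_ACCOUNT_ID='<key-id>'
export B2_ACCOUNT_KEY='<application-key>'
export RESTIC_REPOSITORY='b2:<bucket>:/stash/gitea'
export RESTIC_PASSWORD='<password from the backend Secret>'

restic snapshots   # list snapshots with their hostname and paths
restic ls latest   # list the files inside the most recent snapshot
```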
Here's the RestoreSession YAML:
apiVersion: stash.appscode.com/v1beta1
kind: RestoreSession
metadata:
  name: restore
  namespace: gitea-restore
spec:
  repository:
    name: gitea-to-b2
    namespace: stash
  target: # target indicates where the recovered data will be stored
    ref:
      apiVersion: apps/v1
      kind: StatefulSet
      name: gitea
    volumeMounts:
    - mountPath: /data
      name: data
    rules:
    - paths:
      - /data
Hello @sgielen! Can you show the YAML of the BackupConfiguration?
Yes, here it is:
$ kubectl get backupconfiguration -n gitea -o yaml backup
apiVersion: stash.appscode.com/v1beta1
kind: BackupConfiguration
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"stash.appscode.com/v1beta1","kind":"BackupConfiguration","metadata":{"annotations":{},"name":"backup","namespace":"gitea"},"spec":{"backupHistoryLimit":3,"driver":"Restic","paused":false,"repository":{"name":"gitea-to-b2","namespace":"stash"},"retentionPolicy":{"keepLast":2,"name":"keep-last-2","prune":true},"runtimeSettings":{"container":{"securityContext":{"runAsGroup":0,"runAsUser":0}}},"schedule":"16 2 * * *","target":{"alias":"gitea","paths":["/data"],"ref":{"apiVersion":"apps/v1","kind":"StatefulSet","name":"gitea"},"volumeMounts":[{"mountPath":"/data","name":"data"}]},"tempDir":{"medium":"Memory","sizeLimit":"256Mi"},"timeOut":"1h"}}
  creationTimestamp: "2022-09-26T09:11:59Z"
  finalizers:
  - stash.appscode.com
  generation: 1
  name: backup
  namespace: gitea
  resourceVersion: "1245525"
  uid: adcd0ed8-99f4-4335-bc71-9b1f2b0aeb07
spec:
  backupHistoryLimit: 3
  driver: Restic
  paused: false
  repository:
    name: gitea-to-b2
    namespace: stash
  retentionPolicy:
    keepLast: 2
    name: keep-last-2
    prune: true
  runtimeSettings:
    container:
      securityContext:
        runAsGroup: 0
        runAsUser: 0
  schedule: 16 2 * * *
  target:
    alias: gitea
    paths:
    - /data
    ref:
      apiVersion: apps/v1
      kind: StatefulSet
      name: gitea
    volumeMounts:
    - mountPath: /data
      name: data
  tempDir:
    medium: Memory
    sizeLimit: 256Mi
  timeOut: 1h
status:
  conditions:
  - lastTransitionTime: "2022-09-26T09:12:00Z"
    message: Repository stash/gitea-to-b2 exist.
    reason: RepositoryAvailable
    status: "True"
    type: RepositoryFound
  - lastTransitionTime: "2022-09-26T09:12:00Z"
    message: Backend Secret stash/b2-secret exist.
    reason: BackendSecretAvailable
    status: "True"
    type: BackendSecretFound
  - lastTransitionTime: "2022-09-26T09:12:00Z"
    message: Successfully validated.
    reason: ResourceValidationPassed
    status: "True"
    type: ValidationPassed
  - lastTransitionTime: "2022-09-26T09:12:01Z"
    message: Backup target StatefulSet gitea/gitea found.
    reason: TargetAvailable
    status: "True"
    type: BackupTargetFound
  - lastTransitionTime: "2022-09-26T09:12:01Z"
    message: Successfully created backup triggering CronJob.
    reason: CronJobCreationSucceeded
    status: "True"
    type: CronJobCreated
  - lastTransitionTime: "2022-09-26T09:12:01Z"
    message: Successfully injected stash sidecar into StatefulSet gitea/gitea
    reason: SidecarInjectionSucceeded
    status: "True"
    type: StashSidecarInjected
  observedGeneration: 1
  phase: Ready
What are the mount paths provided in the backup and restore volumes? Are they the same?
Could you please provide some information about your Snapshot?
To list your Snapshots, run the following command:
kubectl get snapshots -n backup
After executing the command, Stash should return the successful Snapshots in the backup namespace.
Please provide a YAML of a Snapshot from the list. To get the snapshot YAML, run the command:
kubectl get snapshots -n backup <snapshot-name> -o yaml
What are the mount paths provided in the backup and restore volumes? Are they the same?
I think so; RestoreSession:
target:
  ref:
    apiVersion: apps/v1
    kind: StatefulSet
    name: gitea
  rules:
  - paths:
    - /data
  volumeMounts:
  - mountPath: /data
    name: data
BackupConfiguration:
target:
  alias: gitea
  paths:
  - /data
  ref:
    apiVersion: apps/v1
    kind: StatefulSet
    name: gitea
  volumeMounts:
  - mountPath: /data
    name: data
Could you please provide some information about your Snapshot?
To list your Snapshots, run the following command:
kubectl get snapshots -n backup
I have no snapshots in either the gitea or the gitea-restore namespace:
$ kubectl get snapshots -n gitea-restore
No resources found in gitea-restore namespace.
$ kubectl get snapshots -n gitea
No resources found in gitea namespace.
Interestingly, kubectl get snapshots -n stash ~~seems to hang forever~~ eventually returned:
NAME                   ID         REPOSITORY    HOSTNAME   CREATED AT
[6 backups for another namespace]
gitea-to-b2-896834ac   896834ac   gitea-to-b2   gitea-0    2022-09-26T09:17:25Z
One thing that also jumps out at me is that the backup host differs between backup and restore:
BackupSession status.targets[0] contains:
stats:
- duration: 19.145936467s
  hostname: gitea-0
  phase: Succeeded
  snapshots:
  - fileStats:
      modifiedFiles: 0
      newFiles: 1650
      totalFiles: 1650
      unmodifiedFiles: 0
    name: 896834ac
    path: /data
    processingTime: "0:14"
    totalSize: 10.776 MiB
    uploaded: 11.254 MiB
totalHosts: 1
This mentions host gitea-0, while the restore mentions no backups found for host host-0. Could this be related?
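Assuming direct restic access to the repository (with the B2 key pair, RESTIC_REPOSITORY, and RESTIC_PASSWORD from the backend Secret exported locally), this could presumably be confirmed by querying per host, since restic resolves `latest` by host and path:

```shell
restic snapshots --host gitea-0 --path /data   # should list snapshot 896834ac
restic snapshots --host host-0 --path /data    # what the restore queried; empty
```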
Yes, you are right. You used an alias during backup, so you have to use the same alias as the source host during restore. Our documentation should have been clearer about this.
Try this RestoreSession.
apiVersion: stash.appscode.com/v1beta1
kind: RestoreSession
metadata:
  name: restore
  namespace: gitea-restore
spec:
  repository:
    name: gitea-to-b2
    namespace: stash
  target: # target indicates where the recovered data will be stored
    ref:
      apiVersion: apps/v1
      kind: StatefulSet
      name: gitea
    volumeMounts:
    - mountPath: /data
      name: data
    rules:
    - paths:
      - /data
      sourceHost: gitea
With sourceHost: gitea-0 (not gitea) it works indeed!
I0926 14:47:32.345827 1 commands.go:412] sh-output: restoring <Snapshot 896834ac of [/data] at 2022-09-26 09:17:25.215975215 +0000 UTC by root@gitea-0> to /
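For reference, the rules section of the RestoreSession that works for me (as a sketch):

```yaml
rules:
- paths:
  - /data
  sourceHost: gitea-0   # must match the hostname recorded in the snapshot
```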
Would you recommend not using an alias in the BackupConfiguration? I don't remember exactly where I got it from, but it was somewhere in the documentation.
So this issue #1484 is about two topics:
(1) alias / sourceHost configuration documentation
(2) if restore fails within the init container, it prints an error and a stack trace, but it still exits 0 with "restore completed successfully" which is not expected behavior.
Please let me know if you would prefer me filing a separate issue for either.
alias is meant to be used with BatchBackup, where you can back up multiple targets into the same Repository; an identifier is then necessary to separate their data.
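As an illustrative sketch (the names are hypothetical), two targets backed up into one Repository are separated by their alias, and a restore then selects the right data with a matching sourceHost:

```yaml
# BackupConfiguration (fragment): the alias prefixes the snapshot hostname;
# as seen above, alias "gitea" recorded the first StatefulSet pod as gitea-0.
spec:
  target:
    alias: gitea
---
# RestoreSession (fragment): select that target's data via a matching sourceHost.
spec:
  target:
    rules:
    - paths:
      - /data
      sourceHost: gitea-0
```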
(2) if restore fails within the init container, it prints an error and a stack trace, but it still exits 0 with "restore completed successfully" which is not expected behavior.
Unfortunately, we can't detect whether the restore failed in this scenario. The restore process found no data in the backend for the intended target, so it restored nothing and assumed that all data had been restored.