VolumeSnapshot fails with: resource name may not be empty
Version: 0.9.0-rc.6
I'm trying to back up a standalone PVC with a simple backup configuration, following the guide:
apiVersion: stash.appscode.com/v1beta1
kind: BackupConfiguration
metadata:
  name: default
spec:
  schedule: "*/2 * * * *"
  driver: VolumeSnapshotter
  target:
    ref:
      apiVersion: v1
      kind: PersistentVolumeClaim
      name: pgadmin
    snapshotClassName: csi-rbdplugin-snapclass
  retentionPolicy:
    name: default
    keepDaily: 7
    keepWeekly: 4
    keepMonthly: 6
    prune: true
The backup fails with these operator logs:
I0527 21:44:05.350737 1 jobs.go:68] Sync/Add/Update for Job stash-backup-default-1590615840
I0527 21:44:05.393717 1 jobs.go:68] Sync/Add/Update for Job stash-backup-default-1590615840
I0527 21:44:06.744865 1 backup_session.go:104] Sync/Add/Update for BackupSession default-1590615846
I0527 21:44:06.773364 1 job.go:36] Creating Job pgadmin/stash-vs-pvc-pgadmin-1590615846.
W0527 21:44:06.809239 1 backup_session.go:544] failed to ensure backup job. Reason: resource name may not be empty
W0527 21:44:06.895426 1 backup_session.go:99] BackupSession pgadmin/default-1590615726 does not exist anymore
E0527 21:44:06.895543 1 worker.go:92] Failed to process key pgadmin/default-1590615846. Reason: resource name may not be empty
I0527 21:44:06.895560 1 worker.go:96] Error syncing key pgadmin/default-1590615846: resource name may not be empty
I0527 21:44:06.895604 1 backup_session.go:104] Sync/Add/Update for BackupSession default-1590615846
I0527 21:44:06.895669 1 backup_session.go:112] Skipping processing BackupSession pgadmin/default-1590615846. Reason: phase is "Failed".
I0527 21:44:06.900750 1 backup_session.go:104] Sync/Add/Update for BackupSession default-1590615846
I0527 21:44:06.900793 1 backup_session.go:112] Skipping processing BackupSession pgadmin/default-1590615846. Reason: phase is "Failed".
I0527 21:44:07.585139 1 jobs.go:68] Sync/Add/Update for Job stash-backup-default-1590615840
I0527 21:44:07.585226 1 jobs.go:71] Deleting succeeded job stash-backup-default-1590615840
I0527 21:44:07.600267 1 jobs.go:82] Deleted stash job: stash-backup-default-1590615840
W0527 21:44:07.601029 1 jobs.go:64] Job pgadmin/stash-backup-default-1590615840 does not exist anymore
The Rook RBD snapshot class is present:
kubectl get volumesnapshotclasses
NAME AGE
csi-rbdplugin-snapclass 6h12m
Any idea when this will be investigated / fixed?
This seems like a serious issue for an rc6 release.
@asoltesz Can you please try the latest build from master?
helm install stash-operator appscode/stash \
--version v0.9.0-rc.6 \
--namespace kube-system \
--set operator.registry=appscodeci \
--set operator.tag=v0.9.0-rc.6-30-gae2d74fa_linux_amd64
@hossainemruz I have tried the version you recommended.
There are no error messages in the operator log now:
I0601 20:26:08.225724 1 jobs.go:69] Sync/Add/Update for Job stash-backup-default-1591043160
I0601 20:26:08.225751 1 jobs.go:72] Deleting succeeded job stash-backup-default-1591043160
I0601 20:26:08.254625 1 jobs.go:83] Deleted stash job: stash-backup-default-1591043160
W0601 20:26:08.254662 1 jobs.go:65] Job pgadmin/stash-backup-default-1591043160 does not exist anymore
I0601 20:26:14.404786 1 pvc.go:56] Sync/Add/Update for PersistentVolumeClaim pgadmin/pgadmin
I0601 20:26:21.567866 1 pvc.go:56] Sync/Add/Update for PersistentVolumeClaim pgadmin/pgadmin
However, the snapshot has not been created: "kubectl get volumesnapshot --all-namespaces" returns nothing.
The description of the BackupConfiguration:
kubectl describe backupconfiguration default
Name: default
Namespace: pgadmin
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"stash.appscode.com/v1beta1","kind":"BackupConfiguration","metadata":{"annotations":{},"name":"default","namespace":"pgadmin...
API Version: stash.appscode.com/v1beta1
Kind: BackupConfiguration
Metadata:
Creation Timestamp: 2020-06-01T20:11:58Z
Finalizers:
stash.appscode.com
Generation: 2
Resource Version: 16695
Self Link: /apis/stash.appscode.com/v1beta1/namespaces/pgadmin/backupconfigurations/default
UID: dddee1c9-d7d4-45d0-8906-8dbf481e54a2
Spec:
Driver: VolumeSnapshotter
Paused: false
Retention Policy:
Keep Daily: 7
Keep Monthly: 6
Keep Weekly: 4
Name: default
Prune: true
Schedule: */2 * * * *
Target:
Ref:
API Version: v1
Kind: PersistentVolumeClaim
Name: pgadmin
Snapshot Class Name: csi-rbdplugin-snapclass
Status:
Conditions:
Last Transition Time: 2020-06-01T20:11:58Z
Message: Backup target v1 persistentvolumeclaim/pgadmin found.
Reason: TargetAvailable
Status: True
Type: BackupTargetFound
Last Transition Time: 2020-06-01T20:11:58Z
Message: Successfully created backup triggering CronJob.
Reason: CronJobCreationSucceeded
Status: True
Type: CronJobCreated
Observed Generation: 2
Events: <none>
Note: I use Kubernetes 1.15.11 with the "snapshot.storage.k8s.io/v1alpha1" API.
There is now also a v1beta1 API. Could this be the problem? Which one does Stash target?
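For what it's worth, one way to check which snapshot API versions a cluster actually serves (a quick diagnostic, not from the original report):

```shell
# List the snapshot API group versions served by this cluster.
# On Kubernetes 1.15 with the alpha snapshot CRDs this typically shows
# only v1alpha1; v1beta1 appears once the newer CRDs are installed.
kubectl api-versions | grep snapshot.storage.k8s.io

# Alternatively, inspect the VolumeSnapshot CRD itself (the exact field
# layout depends on which apiextensions version the CRD was created with):
kubectl get crd volumesnapshots.snapshot.storage.k8s.io -o yaml | grep -A3 versions
```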
I can create a VolumeSnapshot manually like this:
apiVersion: snapshot.storage.k8s.io/v1alpha1
kind: VolumeSnapshot
metadata:
  name: pgadmin-backup
  namespace: pgadmin
spec:
  snapshotClassName: csi-rbdplugin-snapclass
  source:
    name: pgadmin
    kind: PersistentVolumeClaim
The resulting VolumeSnapshot and its VolumeSnapshotContent is visible with kubectl.
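For completeness, the state of a manually created snapshot can be checked with something like the following (a sketch; the exact status fields depend on the installed snapshot CRD version):

```shell
# Check whether the snapshot was bound to a VolumeSnapshotContent object
kubectl get volumesnapshot pgadmin-backup -n pgadmin -o yaml

# The bound content object holds the actual CSI snapshot handle
kubectl get volumesnapshotcontent
```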
Can you describe any BackupSession?
I believe this belonged to the first execution today:
kubectl describe backupsession default-1591042691
Name: default-1591042691
Namespace: pgadmin
Labels: app.kubernetes.io/component=stash-backup
app.kubernetes.io/managed-by=stash.appscode.com
stash.appscode.com/invoker-name=default
stash.appscode.com/invoker-type=BackupConfiguration
Annotations: <none>
API Version: stash.appscode.com/v1beta1
Kind: BackupSession
Metadata:
Creation Timestamp: 2020-06-01T20:18:11Z
Generation: 1
Owner References:
API Version: stash.appscode.com/v1beta1
Block Owner Deletion: true
Controller: true
Kind: BackupConfiguration
Name: default
UID: dddee1c9-d7d4-45d0-8906-8dbf481e54a2
Resource Version: 16804
Self Link: /apis/stash.appscode.com/v1beta1/namespaces/pgadmin/backupsessions/default-1591042691
UID: d6659208-dc83-442f-8f24-5cc7eb0a6e80
Spec:
Invoker:
API Group: stash.appscode.com
Kind: BackupConfiguration
Name: default
Status:
Phase: Running
Targets:
Phase: Running
Ref:
Kind: PersistentVolumeClaim
Name: pgadmin
Total Hosts: 1
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal BackupSession Running 28m BackupSession Controller Backup job has been created succesfully/sidecar is watching the BackupSession.
Any log from the backup job?
Here they are:
kc logs stash-vs-pvc-pgadmin-1591044603-xtp97
I0601 20:50:04.574623 1 log.go:181] FLAG: --alsologtostderr="false"
I0601 20:50:04.574772 1 log.go:181] FLAG: --backupsession="default-1591044603"
I0601 20:50:04.574781 1 log.go:181] FLAG: --bypass-validating-webhook-xray="false"
I0601 20:50:04.574789 1 log.go:181] FLAG: --enable-analytics="true"
I0601 20:50:04.574797 1 log.go:181] FLAG: --help="false"
I0601 20:50:04.574805 1 log.go:181] FLAG: --kubeconfig=""
I0601 20:50:04.574811 1 log.go:181] FLAG: --log-flush-frequency="5s"
I0601 20:50:04.574820 1 log.go:181] FLAG: --log_backtrace_at=":0"
I0601 20:50:04.574835 1 log.go:181] FLAG: --log_dir=""
I0601 20:50:04.574842 1 log.go:181] FLAG: --logtostderr="true"
I0601 20:50:04.575139 1 log.go:181] FLAG: --master=""
I0601 20:50:04.575176 1 log.go:181] FLAG: --metrics-enabled="true"
I0601 20:50:04.575189 1 log.go:181] FLAG: --pushgateway-url="http://rolling-quetzal-stash.kube-system.svc:56789"
I0601 20:50:04.575197 1 log.go:181] FLAG: --service-name="stash-operator"
I0601 20:50:04.575206 1 log.go:181] FLAG: --stderrthreshold="0"
I0601 20:50:04.575214 1 log.go:181] FLAG: --target-kind="PersistentVolumeClaim"
I0601 20:50:04.575222 1 log.go:181] FLAG: --target-name="pgadmin"
I0601 20:50:04.575230 1 log.go:181] FLAG: --use-kubeapiserver-fqdn-for-aks="true"
I0601 20:50:04.575238 1 log.go:181] FLAG: --v="3"
I0601 20:50:04.575247 1 log.go:181] FLAG: --vmodule=""
W0601 20:50:04.663968 1 client_config.go:552] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
Error: the server could not find the requested resource (post volumesnapshots.snapshot.storage.k8s.io)
Usage:
stash create-vs [flags]
Flags:
--backupsession string Name of the respective BackupSession object
-h, --help help for create-vs
--kubeconfig string Path to kubeconfig file with authorization information (the master location is set by the master flag).
--master string The address of the Kubernetes API server (overrides any value in kubeconfig)
--metrics-enabled Specify whether to export Prometheus metrics (default true)
--pushgateway-url string Pushgateway URL where the metrics will be pushed
--target-kind string Kind of the Target
--target-name string Name of the Target
Global Flags:
--alsologtostderr log to standard error as well as files
--bypass-validating-webhook-xray if true, bypasses validating webhook xray checks
--enable-analytics Send analytical events to Google Analytics (default true)
--log-flush-frequency duration Maximum number of seconds between log flushes (default 5s)
--log_backtrace_at traceLocation when logging hits line file:N, emit a stack trace (default :0)
--log_dir string If non-empty, write log files in this directory
--logtostderr log to standard error instead of files (default true)
--service-name string Stash service name. (default "stash-operator")
--stderrthreshold severity logs at or above this threshold go to stderr
--use-kubeapiserver-fqdn-for-aks if true, uses kube-apiserver FQDN for AKS cluster to workaround https://github.com/Azure/AKS/issues/522 (default true)
-v, --v Level log level for V logs (default 0)
--vmodule moduleSpec comma-separated list of pattern=N settings for file-filtered logging
F0601 20:50:04.706442 1 main.go:41] Error in Stash Main: the server could not find the requested resource (post volumesnapshots.snapshot.storage.k8s.io)
> Note: I use Kubernetes 1.15.11 with the "snapshot.storage.k8s.io/v1alpha1" API. There is now a v1beta1 API also. Can this be a problem? Which one does Stash target?
It seems that is the case. Stash uses snapshot.storage.k8s.io/v1beta1. Can you try the v1beta1 API?
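For reference, the manifest above expressed against the v1beta1 API would look roughly like this; note the renamed fields (`snapshotClassName` becomes `volumeSnapshotClassName`, and the typed `source` uses `persistentVolumeClaimName`), which illustrate the kind of breaking changes between the two versions:

```yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: pgadmin-backup
  namespace: pgadmin
spec:
  volumeSnapshotClassName: csi-rbdplugin-snapclass
  source:
    persistentVolumeClaimName: pgadmin
```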
Not easily.
I would have to migrate my stack to Kubernetes 1.17, because the v1beta1 API is only available from that version.
Unfortunately, Kubernetes 1.16+ removed a lot of older API versions that some of my workloads depend on, so the migration is non-trivial.
In any case, a big fat warning seems warranted in the Stash guides that the Volume Snapshotting feature is currently only operational on Kubernetes 1.17+.
However, this severely limits the usability of Stash (big installations will certainly not migrate quickly to such new Kubernetes versions), so it might be worth considering support for the older API version too.
> In any case, a big fat warning seems to be warranted in the Stash guides about the Volume Snapshotting feature currently only operational on Kubernetes 1.17+.
Agreed.
> However, this severely limits the usability of Stash (big installations will certainly not migrate quickly to such new Kubernetes versions), so it might be worth thinking about supporting the older API version too.
It won't be easy for us to support both API versions; there are some breaking changes. We would rather go with the v1beta1 API.
> It won't be easy for us to support both API versions. There are some breaking changes. We would rather go with the v1beta1 API.
I perfectly understand the issue; it would increase the complexity of Stash.
If it is acceptable that major functionality of Stash only works on Kubernetes 1.17+ (a fairly recent version), then that is perfectly fine.
However, to be fair, this should be communicated clearly on the main documentation pages of Stash, so that people don't waste their time on software that is not applicable to them.
Maybe there could also be a compact "Features" page in the documentation (like on the website, but without the fluff) that lists the major features along with the minimum and maximum Kubernetes versions each works with.
I am pretty certain that non-trivial Kubernetes installations will always have a hard time upgrading to newer versions, due to the backwards incompatibilities continuously introduced in Kubernetes releases (like the dropped v1alpha1 snapshot API we have just bumped into) and the wide and colorful ecosystem around it. I have a pretty hard time collecting software that is capable of working together in a single cluster (true, I am not a Kubernetes veteran yet).
> Maybe there could also be a compact "Features" page in the documentation (like on the website but without fluff) that lists the major features and include the Kubernetes minimum - maximum version range it is capable of working in.
I completely agree. We already planned something like this: documentation acknowledging the limitations of the current release. However, we couldn't make it happen (we couldn't find the time when we planned it, and then forgot about it later).
Btw, we have extensive E2E tests that run against Kubernetes 1.11.x through 1.18.x. Unfortunately, we don't have any E2E tests for VolumeSnapshot, which is why this problem went unnoticed. All our tests run in GitHub Actions, and since VolumeSnapshot requires a cloud-provider-specific CSI driver, it's not practical to run such tests there. However, there is a hostPath CSI driver that should work; we just didn't get time to explore it.
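A rough sketch of what such a test environment might look like, assuming the upstream kubernetes-csi/csi-driver-host-path deploy-script layout (the exact script and example file paths are assumptions and vary by release):

```shell
# Deploy the hostPath CSI driver into a throwaway test cluster (e.g. kind)
# so that VolumeSnapshot E2E tests can run without a cloud provider.
git clone https://github.com/kubernetes-csi/csi-driver-host-path.git
cd csi-driver-host-path

# The repo ships per-Kubernetes-version deploy scripts; pick the one
# matching the test cluster (path is an assumption, check the repo).
./deploy/kubernetes-latest/deploy.sh

# Then create a VolumeSnapshotClass backed by the hostPath driver
# (file name is an assumption) and point the E2E suite at it.
kubectl apply -f examples/csi-snapshotclass.yaml
```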