velero
PVC restore in EKS does not work with IRSA
What steps did you take and what happened: I recently switched Velero to IRSA, keeping the same policy (the one specified in your docs). I then wiped the cluster, installed Velero (again with IRSA), and ran a restore. Everything restored OK except the PVCs, which failed with a 403 unauthorized error. This was odd because the S3 access was clearly working, which suggests IRSA was set up correctly.
I then reverted Velero to a regular IAM user, and the restore worked fine. I think there is a bug somewhere in the EBS restore path when IRSA is used. As I said, the policy was the same, so I don't know what else it could be.
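For anyone comparing the two setups, one quick sanity check is whether the service account used by the Velero pod actually carries the IRSA role annotation (`eks.amazonaws.com/role-arn`). The sketch below is only an illustration: the function names `irsa_role_arn` and `check_irsa` are hypothetical, and the namespace and service account names are assumptions that depend on how Velero was installed.

```shell
# Hypothetical helper: print the IAM role ARN annotated on a service account.
# Prints nothing if the annotation is absent.
irsa_role_arn() {
  namespace="$1"
  service_account="$2"
  kubectl get serviceaccount "$service_account" -n "$namespace" \
    -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'
}

# Hypothetical helper: report whether the IRSA annotation is present.
check_irsa() {
  arn=$(irsa_role_arn "$1" "$2")
  if [ -n "$arn" ]; then
    echo "IRSA role: $arn"
  else
    echo "no eks.amazonaws.com/role-arn annotation found"
  fi
}
```

Assuming both the namespace and the service account are named `velero`, this would be called as `check_irsa velero velero`. A present annotation only shows that the role is wired to the service account; it says nothing about whether the role's policy covers the EBS calls made during a PVC restore.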
What did you expect to happen:
I expected the PVCs to be restored
The following information will help us better understand what's going on:
If you are using velero v1.7.0+:
Please use velero debug --backup <backupname> --restore <restorename>
to generate the support bundle and attach it to this issue. For more options, refer to velero debug --help
If you are using earlier versions:
Please provide the output of the following commands (Pasting long output into a GitHub gist or other pastebin is fine.)
- kubectl logs deployment/velero -n velero
- velero backup describe <backupname> or kubectl get backup/<backupname> -n velero -o yaml
- velero backup logs <backupname>
- velero restore describe <restorename> or kubectl get restore/<restorename> -n velero -o yaml
- velero restore logs <restorename>
Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.]
Environment:
- Velero version (use velero version):
Client:
Version: v1.8.1
Git commit: -
Server:
Version: v1.8.1
- Velero features (use velero client config get features):
features: <NOT SET>
- Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.5", GitCommit:"c285e781331a3785a7f436042c65c5641ce8a9e9", GitTreeState:"clean", BuildDate:"2022-03-16T15:51:05Z", GoVersion:"go1.17.8", Compiler:"gc", Platform:"darwin/arm64"}
Server Version: version.Info{Major:"1", Minor:"21+", GitVersion:"v1.21.12-eks-a64ea69", GitCommit:"d4336843ba36120e9ed1491fddff5f2fec33eb77", GitTreeState:"clean", BuildDate:"2022-05-12T18:29:27Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.23) and server (1.21) exceeds the supported minor version skew of +/-1
- Kubernetes installer & version:
terraform-aws-eks (18.21.0)
- Cloud provider or hardware configuration:
AWS
- OS (e.g. from /etc/os-release): amazon-linux-2
bundle-2022-06-01-16-41-02.tar.gz
Vote on this issue!
This is an invitation to the Velero community to vote on issues. You can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.
- :+1: for "I would like to see this bug fixed as soon as possible"
- :-1: for "There are more important bugs to focus on right now"
Am I alone in this or have other people been having this issue?
@reasonerjt any news on this?
This is important to me as well!!
@bit-herder I tried this in my lab (velero v1.10, aws-plugin v1.6) and IRSA does seem to work.
I also checked the restore log in your log bundle and only found a bunch of errors calling some webhook:
cat ./restore_restore-1.log|grep "level=error"
time="2022-06-01T19:56:08Z" level=error msg="error restoring ingress-sdl-connector: Internal error occurred: failed calling webhook \"validate.nginx.ingress.kubernetes.io\": Post \"https://nginx-ingress-ingress-nginx-controller-admission.nginx-ingress.svc:443/networking/v1/ingresses?timeout=10s\": context deadline exceeded" logSource="pkg/restore/restore.go:1287" restore=velero/restore-1
time="2022-06-01T19:56:18Z" level=error msg="error restoring ingress-dev-proxy: Internal error occurred: failed calling webhook \"validate.nginx.ingress.kubernetes.io\": Post \"https://nginx-ingress-ingress-nginx-controller-admission.nginx-ingress.svc:443/networking/v1/ingresses?timeout=10s\": context deadline exceeded" logSource="pkg/restore/restore.go:1287" restore=velero/restore-1
......
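To separate authorization failures from other errors in a restore log like the one above, a small filter can help. This is a rough sketch, not part of Velero: `classify_restore_errors` is a hypothetical helper, and the patterns it searches for (`status code: 403`, `AccessDenied`, `UnauthorizedOperation`) are assumptions about how an AWS authorization failure would show up in the log.

```shell
# Hypothetical helper: count error-level lines in a Velero restore log and
# how many of them look like AWS authorization failures.
classify_restore_errors() {
  log_file="$1"
  # Assumed patterns for AWS authorization failures; adjust as needed.
  auth_pattern='status code: 403|AccessDenied|UnauthorizedOperation'
  # grep -c exits non-zero on a zero count, so guard with "|| true".
  total=$(grep -c 'level=error' "$log_file" || true)
  auth=$(grep 'level=error' "$log_file" | grep -Ec "$auth_pattern" || true)
  echo "error lines: $total"
  echo "authorization-related: $auth"
}
```

Running `classify_restore_errors ./restore_restore-1.log` prints the two counts. A zero on the second line would be consistent with the observation that the log contains only webhook errors.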
@alievrouw Could you clarify whether you work with @bit-herder or whether you are seeing a similar error when using IRSA?
Additionally, it seems that during installation there's no option for the user to set the service account for the velero pod.
@sseago
Was it discussed and determined not to add it?
If there's no objection, I can write a PR to add that option.
@reasonerjt I don't think I've heard this particular issue coming up. Making it configurable at install makes sense, though, as long as the default behavior (with no user setting) is equivalent to current behavior.
PR #5802, which adds an option for the user to set the service account, has been merged.
I'm closing this issue as non-reproducible.