aws-efs-csi-driver
EKS Addon install missing AWS_DEFAULT_REGION
/kind bug
Thanks in advance for looking into this, and thanks for maintaining this great project :)
What happened?
When I install the EKS Addon (tested via Terraform and the AWS console) with deleteAccessPointRootDir = true, IRSA configured, and access to IMDS restricted, deleting a PVC produces these errors in the logs and the PVC never gets deleted:
E1015 08:53:11.829540 1 mount_linux.go:231] Mount failed: exit status 1
Mounting command: mount
Mounting arguments: -t efs -o tls,iam fs-XXXXXXXXXXXXXXXXXXX /var/lib/csi/pv/fsap-XXXXXXXXXXXXX
Output: Error retrieving region. Please set the "region" parameter in the efs-utils configuration file.
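For anyone hitting this before it's fixed, the error message itself suggests a workaround: pin the region in the efs-utils configuration inside the efs-plugin container. As I understand it (please double-check, this is from memory rather than from the efs-utils docs), the setting lives in the [mount] section of /etc/amazon/efs/efs-utils.conf:

```
# /etc/amazon/efs/efs-utils.conf (inside the efs-plugin container)
# Hypothetical workaround: hard-code the region so efs-utils
# doesn't need IMDS or AWS_DEFAULT_REGION to resolve it.
[mount]
region = ap-southeast-2
```

This only masks the underlying env-var problem, of course, and baking it into the addon's container image isn't practical for most users.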
What you expected to happen? I expect the EKS Addon to work out of the box.
How to reproduce it (as minimally and precisely as possible)? This assumes you've restricted access to IMDS from your pods (by setting a hop limit). Docs here.
- Install the efs-csi-driver EKS Addon on a cluster with deleteAccessPointRootDir = true, using an IRSA service account
- Tail the logs (in a separate terminal): kubectl logs deployment/efs-csi-controller -f -n kube-system
- Create a StorageClass, PVC and pod (dynamic provisioning):
# test.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: test
parameters:
basePath: /test
directoryPerms: "775"
ensureUniqueDirectory: "false"
fileSystemId: fs-XXXXXXX
gid: "65534"
provisioningMode: efs-ap
subPathPattern: /
uid: "65534"
provisioner: efs.csi.aws.com
reclaimPolicy: Delete
volumeBindingMode: Immediate
mountOptions:
- tls
- iam
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: test
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 5Gi
storageClassName: test
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: test
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/instance: test
template:
metadata:
labels:
app.kubernetes.io/instance: test
spec:
containers:
- image: registry.k8s.io/pause:3.9
name: test
resources:
requests:
cpu: 20m
memory: 2Mi
volumeMounts:
- mountPath: /test
name: test
volumes:
- name: test
persistentVolumeClaim:
claimName: test
- kubectl apply -f test.yaml
- kubectl delete -f test.yaml
- See the logs for efs-csi-controller
Anything else we need to know?:
The reason this happens is that when the driver is installed via the EKS Addon, the efs-plugin container has the AWS_REGION environment variable set:
apiVersion: apps/v1
kind: Deployment
metadata:
name: efs-csi-controller
namespace: kube-system
resourceVersion: "8596255"
uid: 09438d06-c1b8-4765-89f6-e696c648d19f
spec:
template:
spec:
containers:
- name: efs-plugin
env:
- name: CSI_ENDPOINT
value: unix:///var/lib/csi/sockets/pluginproxy/csi.sock
- name: AWS_REGION
value: ap-southeast-2
- name: CSI_NODE_NAME
With how IRSA works, if there's already an AWS_REGION variable, the webhook doesn't add the AWS_DEFAULT_REGION variable that the container needs in order to determine its region without calling out to IMDS. At a glance it doesn't look like this would affect people installing via Helm or kustomize.
This should be simple to fix, either:
- Remove that environment variable from the container; IRSA adds it back anyway, although I guess it could break things for people not using IRSA?
- Also set the AWS_DEFAULT_REGION variable explicitly.
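As an interim workaround on an affected cluster, the missing variable can be added to the controller with a strategic-merge patch along these lines (a sketch only: the region value is from my cluster, and the container name efs-plugin is taken from the manifest above):

```yaml
# patch.yaml - strategic-merge patch adding AWS_DEFAULT_REGION
# to the efs-plugin container (region value is an example)
spec:
  template:
    spec:
      containers:
      - name: efs-plugin
        env:
        - name: AWS_DEFAULT_REGION
          value: ap-southeast-2
```

Applied with kubectl patch deployment efs-csi-controller -n kube-system --patch-file patch.yaml, though be aware the EKS addon manager may revert manual changes on its next reconcile.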
Could possibly relate to:
- https://github.com/kubernetes-sigs/aws-efs-csi-driver/issues/1111#issuecomment-1999592464
Environment
- Kubernetes version (use kubectl version):
Client Version: v1.30.4
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.30.4-eks-a737599
- Driver version:
v2.0.7-eksbuild.1
Please also attach debug logs to help us better diagnose
- Instructions to gather debug logs can be found here