EfsCsiDriverAddOn: mount.nfs4: access denied by server while mounting 127.0.0.1:/
Describe the bug
When deploying a StorageClass, PersistentVolumeClaim and Pod while using the EfsCsiDriverAddOn to dynamically provision an EFS Access Point and mount it to the Pod, mounting fails with the error mount.nfs4: access denied by server while mounting 127.0.0.1:/.
Expected Behavior
Mounting the EFS Access Point to the Pod succeeds.
Current Behavior
Running kubectl describe pod/efs-app shows the following Event logs for the Pod:
Name: efs-app
Namespace: default
Priority: 0
Service Account: default
Node: ip-XXX-XXX-XXX-XXX.eu-west-1.compute.internal/XXX.XXX.XXX.XXX
Start Time: Thu, 01 Aug 2024 13:08:05 +0200
Labels: <none>
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Containers:
app:
Container ID:
Image: centos
Image ID:
Port: <none>
Host Port: <none>
Command:
/bin/sh
Args:
-c
while true; do echo $(date -u) >> /data/out; sleep 5; done
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/data from persistent-storage (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-zg68d (ro)
Conditions:
Type Status
PodReadyToStartContainers False
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
persistent-storage:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: efs-claim
ReadOnly: false
kube-api-access-zg68d:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 18s default-scheduler Successfully assigned default/efs-app to ip-XXX-XXX-XXX-XXX.eu-west-1.compute.internal
Warning FailedMount 8s (x5 over 17s) kubelet MountVolume.SetUp failed for volume "pvc-XXXXXXX" : rpc error: code = Internal desc = Could not mount "fs-XXXXXXX:/" at "/var/lib/kubelet/pods/XXXXXXX/volumes/kubernetes.io~csi/pvc-XXXXXXX/mount": mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t efs -o accesspoint=fsap-XXXXXXX,tls fs-XXXXXXX:/ /var/lib/kubelet/pods/XXXXXXX/volumes/kubernetes.io~csi/pvc-XXXXXXX/mount
Output: Could not start amazon-efs-mount-watchdog, unrecognized init system "aws-efs-csi-dri"
b'mount.nfs4: access denied by server while mounting 127.0.0.1:/'
Warning: config file does not have fips_mode_enabled item in section mount.. You should be able to find a new config file in the same folder as current config file /etc/amazon/efs/efs-utils.conf. Consider update the new config file to latest config file. Use the default value [fips_mode_enabled = False].Warning: config file does not have retry_nfs_mount_command item in section mount.. You should be able to find a new config file in the same folder as current config file /etc/amazon/efs/efs-utils.conf. Consider update the new config file to latest config file. Use the default value [retry_nfs_mount_command = True].
However, the creation of the EFS Access Point does succeed, as seen in the AWS Console and via the command kubectl describe pvc/efs-claim:
Name: efs-claim
Namespace: default
StorageClass: efs-sc
Status: Bound
Volume: pvc-XXXXXXX
Labels: <none>
Annotations: pv.kubernetes.io/bind-completed: yes
pv.kubernetes.io/bound-by-controller: yes
volume.beta.kubernetes.io/storage-provisioner: efs.csi.aws.com
volume.kubernetes.io/storage-provisioner: efs.csi.aws.com
Finalizers: [kubernetes.io/pvc-protection]
Capacity: 5Gi
Access Modes: RWX
VolumeMode: Filesystem
Used By: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ExternalProvisioning 17s persistentvolume-controller Waiting for a volume to be created either by the external provisioner 'efs.csi.aws.com' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.
Normal Provisioning 17s efs.csi.aws.com_efs-csi-controller-XXXXXXX External provisioner is provisioning volume for claim "default/efs-claim"
Normal ProvisioningSucceeded 17s efs.csi.aws.com_efs-csi-controller-XXXXXXX Successfully provisioned volume pvc-XXXXXXX
Then the details of the StorageClass, obtained by running kubectl describe sc/efs-sc:
Name: efs-sc
IsDefaultClass: No
Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{},"name":"efs-sc"},"parameters":{"basePath":"/dynamic_provisioning","directoryPerms":"700","ensureUniqueDirectory":"true","fileSystemId":"fs-XXXXXXX","gidRangeEnd":"2000","gidRangeStart":"1000","provisioningMode":"efs-ap","reuseAccessPoint":"false","subPathPattern":"${.PVC.namespace}/${.PVC.name}"},"provisioner":"efs.csi.aws.com"}
Provisioner: efs.csi.aws.com
Parameters: basePath=/dynamic_provisioning,directoryPerms=700,ensureUniqueDirectory=true,fileSystemId=fs-XXXXXXX,gidRangeEnd=2000,gidRangeStart=1000,provisioningMode=efs-ap,reuseAccessPoint=false,subPathPattern=${.PVC.namespace}/${.PVC.name}
AllowVolumeExpansion: <unset>
ReclaimPolicy: Delete
VolumeBindingMode: Immediate
Events: <none>
Lastly, I have also checked the efs-csi-controller logs using the command kubectl logs deployment/efs-csi-controller -n kube-system -c efs-plugin:
I0801 10:38:43.866497 1 config_dir.go:63] Mounted directories do not exist, creating directory at '/etc/amazon/efs'
I0801 10:38:43.867231 1 metadata.go:65] getting MetadataService...
I0801 10:38:43.868837 1 metadata.go:70] retrieving metadata from EC2 metadata service
I0801 10:38:43.871827 1 driver.go:150] Did not find any input tags.
I0801 10:38:43.872040 1 driver.go:116] Registering Node Server
I0801 10:38:43.872062 1 driver.go:118] Registering Controller Server
I0801 10:38:43.872074 1 driver.go:121] Starting efs-utils watchdog
I0801 10:38:43.872155 1 efs_watch_dog.go:216] Copying /etc/amazon/efs/efs-utils.conf since it doesn't exist
I0801 10:38:43.872242 1 efs_watch_dog.go:216] Copying /etc/amazon/efs/efs-utils.crt since it doesn't exist
I0801 10:38:43.873827 1 driver.go:127] Starting reaper
I0801 10:38:43.883901 1 driver.go:137] Listening for connections on address: &net.UnixAddr{Name:"/var/lib/csi/sockets/pluginproxy/csi.sock", Net:"unix"}
I0801 11:07:24.454475 1 controller.go:286] Using user-specified structure for access point directory.
I0801 11:07:24.454501 1 controller.go:292] Appending PVC UID to path.
I0801 11:07:24.454523 1 controller.go:310] Using /dynamic_provisioning/default/efs-claim-XXXXXXX as the access point directory.
Reproduction Steps
- Deploy an EKS Blueprints stack with only the EfsCsiDriverAddOn, a VPC Resource Provider and an EFS Resource Provider.
// lib/stack.ts
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import {
  EksBlueprint,
  ClusterAddOn,
  EfsCsiDriverAddOn,
  GlobalResources,
  VpcProvider,
  CreateEfsFileSystemProvider,
} from '@aws-quickstart/eks-blueprints';

export default class ClusterConstruct extends Construct {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id);

    const account = props?.env?.account!;
    const region = props?.env?.region!;

    const addOns: Array<ClusterAddOn> = [
      new EfsCsiDriverAddOn({
        replicaCount: 1
      }),
    ];

    const blueprint = EksBlueprint.builder()
      .version('auto')
      .account(account)
      .region(region)
      .resourceProvider(GlobalResources.Vpc, new VpcProvider())
      .resourceProvider("efs-file-system", new CreateEfsFileSystemProvider({ name: "efs-file-system" }))
      .addOns(...addOns)
      .build(scope, id + '-eks-efs-poc');
  }
}
- Deploy the StorageClass (with the file system ID updated), PersistentVolumeClaim and Pod from the official aws-efs-csi-driver repository's dynamic_provisioning example. I deploy them one by one, in that order.
- Check the mount status of the Pod with kubectl describe pod/efs-app. Optionally, also check the PVC and SC with kubectl describe pvc/efs-claim and kubectl describe sc/efs-sc.
Possible Solution
Not sure.
Additional Information/Context
I have done the following troubleshooting steps, all of which result in the same error:
- Manually added mountOptions to the provided StorageClass, with iam and tls included. This was suggested in this AWS re:Post.
kind: StorageClass
...
mountOptions:
- tls
- iam
...
- Configured my cluster nodes with a custom role that includes the AWS managed AmazonElasticFileSystemClientReadWriteAccess policy. This did not fix the issue.
// ...
const nodeRole = new CreateRoleProvider("blueprint-node-role", new cdk.aws_iam.ServicePrincipal("ec2.amazonaws.com"),
  [
    cdk.aws_iam.ManagedPolicy.fromAwsManagedPolicyName("AmazonEKSWorkerNodePolicy"),
    cdk.aws_iam.ManagedPolicy.fromAwsManagedPolicyName("AmazonEC2ContainerRegistryReadOnly"),
    cdk.aws_iam.ManagedPolicy.fromAwsManagedPolicyName("AmazonSSMManagedInstanceCore"),
    cdk.aws_iam.ManagedPolicy.fromAwsManagedPolicyName("AmazonEKS_CNI_Policy"),
    cdk.aws_iam.ManagedPolicy.fromAwsManagedPolicyName("CloudWatchAgentServerPolicy"),
    cdk.aws_iam.ManagedPolicy.fromAwsManagedPolicyName("AmazonElasticFileSystemClientReadWriteAccess"), // <-
  ]);

const mngProps: MngClusterProviderProps = {
  version: cdk.aws_eks.KubernetesVersion.of('auto'),
  instanceTypes: [new cdk.aws_ec2.InstanceType("m5.xlarge")],
  amiType: cdk.aws_eks.NodegroupAmiType.AL2_X86_64,
  nodeRole: getNamedResource("node-role") as cdk.aws_iam.Role,
  desiredSize: 2,
  maxSize: 3,
};

// ...
const blueprint = EksBlueprint.builder()
  // ...
  .clusterProvider(new MngClusterProvider(mngProps)) // <-
  .resourceProvider("node-role", nodeRole) // <-
  // ...
- Checked whether the EFS CSI Driver provisions the EFS Access Point correctly, which it does.
- Checked the EFS File System Policy, which looks alright.
- Checked whether EFS is in the same VPC as the EKS Cluster, which it is.
- Checked whether the EFS Security Groups allow inbound NFS (port 2049) traffic, which they do.
CDK CLI Version
2.133.0 (build dcc1e75)
EKS Blueprints Version
1.15.1
Node.js Version
v20.11.0
Environment details (OS name and version, etc.)
Windows 11 Pro 22H2
Other information
While I'm uncertain of the exact cause, I assume it is IAM-related. I found a similar issue on the EKS Blueprints for Terraform repository (https://github.com/aws-ia/terraform-aws-eks-blueprints/issues/1171), which has been solved (https://github.com/aws-ia/terraform-aws-eks-blueprints/pull/1191). Perhaps this has a similar cause? I believe it might be related because the mount option mentioned in that fix does not seem to be included in the mount command in the EKS Blueprints for CDK logs above (specifically the Pod Event logs).
@JonVDB please check the content on EFS filesystem and EFS addon in our workshop for security patterns in EKS here: https://catalog.us-east-1.prod.workshops.aws/workshops/90c9d1eb-71a1-4e0e-b850-dba04ae92887/en-US/security/065-data-encryption/1-stack-setup
You will see steps and policies to configure your EFS filesystem with e2e encryption. Please let me know if that solves the issue; we can then update the docs with that reference.
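For reference, the pattern there is a file system policy that only allows client mounts that come in through a mount target and use encryption in transit. A rough sketch of that kind of policy in CDK (an illustration of the pattern, not copied verbatim from the workshop):

import { aws_iam as iam } from 'aws-cdk-lib';

// Sketch only: allow mounting and writing when the request arrives via a mount
// target and uses encryption in transit. The workshop's actual statements may differ.
const fileSystemPolicy = new iam.PolicyDocument({
  statements: [
    new iam.PolicyStatement({
      effect: iam.Effect.ALLOW,
      principals: [new iam.AnyPrincipal()],
      actions: [
        'elasticfilesystem:ClientMount',
        'elasticfilesystem:ClientWrite',
      ],
      conditions: {
        Bool: {
          'elasticfilesystem:AccessedViaMountTarget': 'true',
          'aws:SecureTransport': 'true',
        },
      },
    }),
  ],
});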
@shapirov103 Hey, I wasn't aware that there was a Workshop for the EFS CSI Driver. I've only used the QuickStart docs. The instructions in the Workshop work perfectly! Issue solved. Thank you!
I had the same issue, but the guide did not provide a solution. When I changed the policy to be more permissive, it finally worked, so I'm not sure that policy works for everyone:
import { aws_iam as iam } from 'aws-cdk-lib';

const eksFileSystemPolicy = new iam.PolicyDocument({
  statements: [new iam.PolicyStatement({
    effect: iam.Effect.ALLOW,
    principals: [new iam.AnyPrincipal()],
    actions: [
      "elasticfilesystem:ClientRootAccess",
      "elasticfilesystem:ClientMount",
      "elasticfilesystem:ClientWrite"
    ],
    conditions: {
      Bool: { "elasticfilesystem:AccessedViaMountTarget": "true" }
    }
  })]
});
When the conditions were removed, mounting worked without the access denied message.
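To wire that policy into the blueprint, I pass it to the EFS resource provider. A sketch, assuming CreateEfsFileSystemProvider forwards efsProps (including fileSystemPolicy, available in recent aws-cdk-lib versions) to the underlying aws-efs FileSystem construct:

import { CreateEfsFileSystemProvider } from '@aws-quickstart/eks-blueprints';

// Assumption: efsProps is passed through to aws-cdk-lib/aws-efs FileSystemProps,
// so fileSystemPolicy becomes the EFS file system policy.
const efsProvider = new CreateEfsFileSystemProvider({
  name: 'efs-file-system',
  efsProps: {
    fileSystemPolicy: eksFileSystemPolicy, // the PolicyDocument defined above
  },
});

// Registered on the builder exactly as in the original stack:
// .resourceProvider('efs-file-system', efsProvider)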