Support NFS on EKS
We have an application from a third-party vendor that we've been running in Kubernetes using an NFS volume to provide multiple-writer "local" storage. The NFS server is an AWS S3 File Gateway.
Image I'm using:
bottlerocket-aws-k8s-1.18-x86_64-v1.0.3-0c93e9a6 / ami-08f192e9c923b9f23 (eksctl selected this for me)
Attached to an EKS 1.18 (eks.2) cluster.
What I expected to happen:
The NFS mount succeeds, allowing the Pod to run. When the Pod is scheduled on an Amazon Linux 2 Node Group, it succeeds.
What actually happened:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedMount 44m (x45 over 7h52m) kubelet Unable to attach or mount volumes: unmounted volumes=[redacted-vol-1 redacted-vol-2], unattached volumes=[redacted-vol-1 db-config default-token-pbxwv redacted-vol-2]: timed out waiting for the condition
Warning FailedMount 19m (x87 over 7h41m) kubelet Unable to attach or mount volumes: unmounted volumes=[redacted-vol-2 redacted-vol-1], unattached volumes=[redacted-vol-2 redacted-vol-1 db-config default-token-pbxwv]: timed out waiting for the condition
Warning FailedMount 4m52s (x239 over 7h55m) kubelet MountVolume.SetUp failed for volume "redacted-vol-1" : mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/7d54f7fe-9a03-4ff7-8ab4-100a13f49a5f/volumes/kubernetes.io~nfs/redacted-vol-1 --scope -- mount -t nfs redacted-nfs-server:/redacted-app/redacted-folder /var/lib/kubelet/pods/7d54f7fe-9a03-4ff7-8ab4-100a13f49a5f/volumes/kubernetes.io~nfs/redacted-vol-1
Output: mount: /var/lib/kubelet/pods/7d54f7fe-9a03-4ff7-8ab4-100a13f49a5f/volumes/kubernetes.io~nfs/redacted-vol-1: bad option; for several filesystems (e.g. nfs, cifs) you might need a /sbin/mount.<type> helper program.
How to reproduce the problem:
Using eksctl, create an EKS cluster with an unmanaged Node Group with amiFamily: Bottlerocket.
Create a Pod with a volume from an NFS server and a container that mounts that volume, e.g.
volumes:
  - name: my-folder
    nfs:
      server: my-nfs-server
      path: /my-s3-bucket/my-folder
volumeMounts:
  - name: my-folder
    mountPath: /my-folder
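Put together, a minimal repro Pod might look like the following sketch (image and names are arbitrary placeholders, not from our actual workload):
apiVersion: v1
kind: Pod
metadata:
  name: nfs-repro
spec:
  containers:
    - name: app
      image: busybox:1.35            # any image works; the failure happens at volume setup
      command: ["sleep", "3600"]
      volumeMounts:
        - name: my-folder
          mountPath: /my-folder
  volumes:
    - name: my-folder
      nfs:
        server: my-nfs-server
        path: /my-s3-bucket/my-folder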
Hello @cwwarren, thank you for opening this issue. Are you using the EFS CSI driver? If so, we have a known issue that we are working on which I can describe further.
Thanks for the quick reply! We are not, and this is not an EFS volume. It is an exposed volume on an S3 File Gateway instance.
Oh I see, sorry about that. So we're probably looking at something like this: https://github.com/kubernetes-csi/csi-driver-nfs, but I don't know if anyone has tried it on Bottlerocket yet.
No worries, thanks for looking into it. We don't have that CSI driver running, and I believe it requires a DaemonSet, so it definitely would have been noticed if EKS had installed it without our knowledge.
I believe the error is coming from the in-tree mounting code; from an (incomplete) skim, it appears to be constructing the command that caused the error, and the error message formats match up.
https://github.com/kubernetes/mount-utils/blob/72e9681f7438005a58689aee6d31b2346aceafce/mount_linux.go#L132
Upon a little further investigation and googling, I believe (but haven't confirmed) that the root cause of this issue is that /sbin/mount.nfs is missing.
This might actually be a feature request instead of a bug: "add NFS common utilities for /sbin/mount.nfs".
We tried on our side using the NFS CSI driver and found the same dependency at issue, /sbin/mount.nfs. I agree we can handle this as a feature request. I am going to label it as such. Do you mind if we rename it to something like "Support NFS on EKS" so that it's easy to identify? Thank you for bringing this to our attention!
Yes please edit and categorize however makes it easiest for your team.
Hi @cwwarren, it would seem that the NFS CSI driver is currently in alpha and does not yet support pod inline volumes like the one you're trying to use here.
The workaround is to create a persistent volume backed by your NFS server and mount that for your pods. You can find the driver parameters here.
Please let us know if that works for you.
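For illustration, a statically provisioned volume with csi-driver-nfs might look roughly like the following sketch (the server, share, and object names are placeholders, not taken from your setup):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv
spec:
  capacity:
    storage: 10Gi                 # NFS doesn't enforce this, but the field is required
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
    - nfsvers=4.1
  csi:
    driver: nfs.csi.k8s.io
    volumeHandle: my-nfs-server/my-s3-bucket/my-folder   # must be unique across PVs
    volumeAttributes:
      server: my-nfs-server
      share: /my-s3-bucket/my-folder
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""            # bind to the pre-created PV rather than provisioning dynamically
  volumeName: nfs-pv
  resources:
    requests:
      storage: 10Gi
The pod would then reference nfs-pvc through a persistentVolumeClaim volume source instead of an inline nfs volume.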
Hey @etungsten thanks for looking into it. We're not (yet) using the NFS CSI driver; we're still using the in-tree plugin, so I doubt that in our specific case we're running into any limitations of the CSI driver. This is an existing Deployment that runs successfully on AWS EKS Amazon Linux 2 nodes (and before that, Ubuntu and Debian on a kops cluster).
Those steps are a good reference though, thanks. I'll note that for when we do migrate to the CSI driver in the near future.
Hi, are there any plans to get this implemented anytime soon? I'm using AWS EKS v1.19 + Bottlerocket amazon-eks-gpu-node-1.19-v20210504, and when trying to use a PVC from AWS EFS (NFS) I get the error below:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 3m21s default-scheduler Successfully assigned tyk/gateway-tyk-pro-lmtbc to ip-10-80-125-180.eu-central-1.compute.internal
Warning FailedMount 78s kubelet Unable to attach or mount volumes: unmounted volumes=[geoip], unattached volumes=[tyk-mgmt-gateway-conf tyk-scratch tyk-pro-default-cert geoip default-token-b72n6]: timed out waiting for the condition
Warning FailedMount 72s (x9 over 3m20s) kubelet MountVolume.SetUp failed for volume "pvc-8cfab7c8-9269-4586-8ffa-314b76d8d409" : mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/2028a85a-81dc-422f-933d-056a856a68da/volumes/kubernetes.io~nfs/pvc-8cfab7c8-9269-4586-8ffa-314b76d8d409 --scope -- mount -t nfs -o vers=4.1 10.80.120.174:/persistent-volumes/xxxxxx/tyk/tyk-efs-volume-pvc-8cfab7c8-9269-4586-8ffa-314b76d8d409 /var/lib/kubelet/pods/2028a85a-81dc-422f-933d-056a856a68da/volumes/kubernetes.io~nfs/pvc-8cfab7c8-9269-4586-8ffa-314b76d8d409
Output: mount: /var/lib/kubelet/pods/2028a85a-81dc-422f-933d-056a856a68da/volumes/kubernetes.io~nfs/pvc-8cfab7c8-9269-4586-8ffa-314b76d8d409: bad option; for several filesystems (e.g. nfs, cifs) you might need a /sbin/mount.<type> helper program.
@webern @gregdek
when trying to use a PVC from AWS EFS (NFS) I get the error below:
I think this is different from the OP, who is trying to mount S3 as an NFS share.
The EFS CSI driver was known to be working fairly recently when I fixed this incompatibility:
- https://github.com/kubernetes-sigs/aws-efs-csi-driver/pull/286
- https://github.com/bottlerocket-os/bottlerocket/issues/1111
When working on it, the instructions I followed to deploy an EFS PVC were these: https://docs.aws.amazon.com/eks/latest/userguide/efs-csi.html
Does the user guide match what you are doing?
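For reference, the static-provisioning example in that guide boils down to a PersistentVolume backed by the efs.csi.aws.com driver, roughly like the sketch below (the filesystem ID is a placeholder). Volumes mounted through the CSI driver show up under kubernetes.io~csi in the kubelet paths, rather than kubernetes.io~nfs as in the events above.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv
spec:
  capacity:
    storage: 5Gi               # EFS is elastic; the value is required but not enforced
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-0123456789abcdef0   # placeholder EFS filesystem ID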
@webern yes, in my case the EFS CSI driver is also installed, but I still get the same error.
Warning FailedMount 12s (x6 over 27s) kubelet, ip-10-80-127-28.eu-central-1.compute.internal MountVolume.SetUp failed for volume "pvc-8cfab7c8-9269-4586-8ffa-314b76d8d409" : mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/7912b416-c5a4-4be6-98e7-e794b003ddaa/volumes/kubernetes.io~nfs/pvc-8cfab7c8-9269-4586-8ffa-314b76d8d409 --scope -- mount -t nfs -o vers=4.1 10.80.120.174:/persistent-volumes/eks2/tyk/tyk-efs-volume-pvc-8cfab7c8-9269-4586-8ffa-314b76d8d409 /var/lib/kubelet/pods/7912b416-c5a4-4be6-98e7-e794b003ddaa/volumes/kubernetes.io~nfs/pvc-8cfab7c8-9269-4586-8ffa-314b76d8d409
Output: mount: /var/lib/kubelet/pods/7912b416-c5a4-4be6-98e7-e794b003ddaa/volumes/kubernetes.io~nfs/pvc-8cfab7c8-9269-4586-8ffa-314b76d8d409: bad option; for several filesystems (e.g. nfs, cifs) you might need a /sbin/mount.<type> helper program.
**kubectl get pod -n kube-system -l app=efs-csi-node -o wide**
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
efs-csi-node-kkj8r 3/3 Running 0 7m28s 10.80.127.28 ip-10-80-127-28.eu-central-1.compute.internal <none> <none>
**kubectl get nodes -o wide**
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
ip-10-80-127-28.eu-central-1.compute.internal Ready <none> 7m47s v1.19.9 10.80.127.28 <none> Bottlerocket OS 1.0.8 5.4.105 containerd://1.4.4+bottlerocket
In case it is not related, maybe there is already an issue open on that topic?
maybe there is already an issue open on that topic?
There was, but it was fixed. Do you mind opening a new one? I suggest a title like "Unable to mount EFS volume with the aws-efs-csi-driver". We'll try to figure out what's going on there and leave this issue for the more general-purpose NFS issue, such as supporting csi-driver-nfs.
@webern sure thing, pls refer to #1599
Hey,
is there any update on when NFS support will be added to Bottlerocket? We are trying to migrate from Amazon Linux 2 to Bottlerocket, but some of our pods require NFS server access.
We are currently using them via the nfs volume type in volumes:
spec:
  volumes:
    - name: nfs-server
      nfs:
        server: 10.10.1.1
        path: /path
@basert, thanks for reaching out! If you're running on AWS, our understanding is you should be able to mount the volumes with the aws-efs-csi-driver detailed above. If not, can you elaborate on your use-case?
In our use case, multiple legacy deployments are using the same EFS share to access files. We tried getting this working with the EFS CSI driver but failed, because we need to mount the pre-existing EFS volume in multiple deployments.
The EFS driver either needs pre-created persistent volumes (which can't be claimed by multiple deployments in different namespaces) or a storage class, which requires an access point. The driver will force a basePath on the volume (https://github.com/kubernetes-sigs/aws-efs-csi-driver/blob/f6d289667ea71f2c2d1a9ec78b0224f066369e40/pkg/driver/controller.go#L201), so we can't use the same file root on multiple pods.
Both solutions do not work for us.
Now that I think about this a bit, I'm not sure we did everything correctly when trying to migrate to the EFS CSI driver. Logic-wise it should work just fine. I will test again and check if I can come up with a reproducible case :)
Hello there
I cannot use the NFS CSI driver because it does not support read-only NFS endpoints. So this is still a problem, and the basic "NFS" mount should be supported.
Hey @duckie, could you please help us clarify your use case? Do you want to:
1. Use an NFS filesystem exported by the NFS server as read-only, i.e.:
# /etc/exports
/nfsshare my.domain(ro)
2. Mount the NFS filesystem in the pod as read-only, i.e.:
mount.nfs -o ro <> <>
For 1) we might have to ask the NFS CSI driver maintainers to support the use case; for 2) I think you can add mount options, as described in the NFS CSI README.
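For 2), assuming the csi-driver-nfs route, a minimal read-only sketch (server, share, and names are placeholders) would pass ro through the PV's mountOptions:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-ro-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadOnlyMany
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
    - ro                          # mount the export read-only inside the pod
  csi:
    driver: nfs.csi.k8s.io
    volumeHandle: my-nfs-server/nfsshare/ro   # must be unique across PVs
    volumeAttributes:
      server: my-nfs-server
      share: /nfsshare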
Hello team, do you know if a solution to the problem has come out?
@mlorenzo92, thanks for reaching out. Are you able to take advantage of the aws-efs-csi-driver?
@jpculp thanks, it's working perfectly. I installed the driver following this doc: https://docs.aws.amazon.com/eks/latest/userguide/efs-csi.html
We have verified https://github.com/kubernetes-csi/csi-driver-nfs successfully on Bottlerocket. The newer version, v4.1.0 at least, seems to work fine without needing nfs-common or other binary dependencies on the host OS.
The StorageClass to mount an on-prem NFS share was defined like this:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-sc
provisioner: nfs.csi.k8s.io
parameters:
  server: <nfs-domain>
  share: /<share-path>
  subDir: <subDir>
reclaimPolicy: Delete
volumeBindingMode: Immediate
mountOptions:
  - nfsvers=3
  - hard
  - nolock
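For completeness, a claim against that class looks something like the following (name and size are arbitrary); the driver dynamically provisions a subdirectory under the configured share for each claim, and pods then mount the claim via a persistentVolumeClaim volume source:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-sc-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: nfs-sc
  resources:
    requests:
      storage: 10Gi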
Thanks for sharing @gazal-k ! @cwwarren, could you please try a newer version of the CSI driver and let us know if you are still having problems?
I have since switched jobs and am no longer working with Kubernetes or EFS, so I'm unable to readily test. However, the comments in this thread seem to indicate it's all working now! I'm more than happy to have this closed out.
Many thanks to the team for getting this shipped!
A lot of history here in the comments, but if I understand correctly everything is now working as expected. If not, please open a new issue to track any problems. Thanks!