[BUG] cannot mount nfs shares from inside pods

Open • fragolinux opened this issue 2 years ago • 16 comments

What did you do

I initially tested the OpenEBS nfs-provisioner on top of k3d's default local-path storage class. The PVCs were created, but pods could not mount them, failing with "not permitted" or "not supported". I could mount the shares from inside the OpenEBS NFS sharing pods, even across them (a pod could mount its own shares AND the shares of the other pod, which shares a different PVC), but NO OTHER pods could mount them: they all remain in ContainerCreating state, with mount errors in their events.

So I tried a different solution: an NFS server Docker container running on my host machine, connected through the NFS subdir provisioner, with identical results. It seems I cannot get an RWX volume on k3d right now, whatever solution I try. Tested on both my dev machine (MacBook Pro, latest Big Sur) AND an Ubuntu 22.04 VM (with the nfs-common package installed on it, of course).

  • How was the cluster created?
docker network create --subnet="" --gateway="" "internalNetwork"

k3d cluster create test --network internalNetwork

mkdir -p ~/nfsshare

docker run -d --net=internalNetwork -p 2049:2049 --name nfs --privileged -v ~/nfsshare:/nfsshare -e SHARED_DIRECTORY=/nfsshare itsthenetwork/nfs-server-alpine:latest

helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/

helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner     --set nfs.server=host.k3d.internal --set nfs.path=/

The pod stays in "ContainerCreating" state, and in its events I get:

Message:             MountVolume.SetUp failed for volume "nfs-subdir-external-provisioner-root" : mount failed: exit status 255
Mounting command: mount
Mounting arguments: -t nfs host.k3d.internal:/ /var/lib/kubelet/pods/ddeba612-5e1c-4ae6-8068-f641f42706ca/volumes/kubernetes.io~nfs/nfs-subdir-external-provisioner-root
Output: mount: mounting host.k3d.internal:/ on /var/lib/kubelet/pods/ddeba612-5e1c-4ae6-8068-f641f42706ca/volumes/kubernetes.io~nfs/nfs-subdir-external-provisioner-root failed: Not supported

so, let's try from an ubuntu pod

kubectl run ubuntu --image ubuntu sleep infinity
# shell inside, then:
apt update
apt install nfs-common -y
mkdir t
mount -t nfs host.k3d.internal:/ t # host is correctly resolved using dig...
mount.nfs: Operation not permitted

test from host to see if the share works: it does...

mkdir t0
sudo mount -t nfs localhost:/ t0
touch t0/aaa
ls t0 # aaa exists
ls ~/nfsshare # and is visible in the share
touch ~/nfsshare/bbb
ls ~/nfsshare # bbb exists in share
ls t0 # and is visible in local folder

What did you expect to happen

The share should be mountable, so that RWX volumes can be created.

Which OS & Architecture

  • output of k3d runtime-info
arch: x86_64
cgroupdriver: systemd
cgroupversion: "2"
endpoint: /var/run/docker.sock
filesystem: extfs
name: docker
os: Ubuntu 22.04 LTS
ostype: linux
version: 20.10.12

Which version of k3d

  • output of k3d version
k3d version v5.4.4
k3s version v1.23.8-k3s1 (default)

Which version of docker

  • output of docker version and docker info
 Version:           20.10.12
 API version:       1.41
 Go version:        go1.17.3
 Git commit:        20.10.12-0ubuntu4
 Built:             Mon Mar  7 17:10:06 2022
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

  Version:          20.10.12
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.17.3
  Git commit:       20.10.12-0ubuntu4
  Built:            Mon Mar  7 15:57:50 2022
  OS/Arch:          linux/amd64
  Experimental:     false
  Version:          v1.6.6
  GitCommit:        10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
  Version:          1.1.3
  GitCommit:        v1.1.3-0-g6724737f
  Version:          0.19.0
 Context:    default
 Debug Mode: false

 Containers: 5
  Running: 5
  Paused: 0
  Stopped: 0
 Images: 24
 Server Version: 20.10.12
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runtime.v1.linux runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
 runc version: v1.1.3-0-g6724737f
 init version:
 Security Options:
   Profile: default
 Kernel Version: 5.15.0-41-generic
 Operating System: Ubuntu 22.04 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 3.827GiB
 Name: lima-ubuntu
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Experimental: false
 Insecure Registries:
 Live Restore Enabled: false

fragolinux • Jul 23 '22 16:07

Update: I tried the Ganesha NFS provisioner too, same setup as above. That one also fails to create usable NFS shares. My pods are now in a different state (CreateContainerConfigError), and I get this in events:

Message:             MountVolume.SetUp failed for volume "pvc-8352d87c-c342-4777-892f-ef94f02d8ded" : mount failed: exit status 255
Mounting command: mount
Mounting arguments: -t nfs -o vers=4 /var/lib/kubelet/pods/b56c8042-25aa-4d7b-9045-2b0827c03c8d/volumes/kubernetes.io~nfs/pvc-8352d87c-c342-4777-892f-ef94f02d8ded
Output: mount: mounting on /var/lib/kubelet/pods/b56c8042-25aa-4d7b-9045-2b0827c03c8d/volumes/kubernetes.io~nfs/pvc-8352d87c-c342-4777-892f-ef94f02d8ded failed: Stale file handle

Everything else is the same, and the storage class is always "nfs". This is my HelmRelease:

apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: ${release}
  namespace: flux-system
spec:
  chart:
    spec:
      chart: nfs-server-provisioner # nfs-provisioner
      version: "1.4.0" # "0.9.0"
      sourceRef:
        kind: HelmRepository
        name: ${release}
  interval: 1h0m0s
  releaseName: ${release}
  targetNamespace: ${namespace}
  values:
    # see https://artifacthub.io/packages/helm/kvaps/nfs-server-provisioner
    persistence:
      enabled: true
      storageClass: "local-path"
      size: 1Gi
    storageClass:
      defaultClass: false
      name: nfs
      reclaimPolicy: Retain
      mountOptions:
        - "vers=4"

fragolinux • Jul 23 '22 19:07

I just tried the Rook NFS provisioner as well. That does not work on k3d 5.4.4 either, with these errors:

Unable to attach or mount volumes: unmounted volumes=[rook-nfs-vol], unattached volumes=[rook-nfs-vol kube-api-access-tv299]: timed out waiting for the condition

Message:             MountVolume.SetUp failed for volume "pvc-8352d87c-c342-4777-892f-ef94f02d8ded" : mount failed: exit status 255
Mounting command: mount
Mounting arguments: -t nfs -o vers=4 /var/lib/kubelet/pods/a068abfc-8347-4f76-87b2-f9269b37c0db/volumes/kubernetes.io~nfs/pvc-8352d87c-c342-4777-892f-ef94f02d8ded
Output: mount: mounting on /var/lib/kubelet/pods/a068abfc-8347-4f76-87b2-f9269b37c0db/volumes/kubernetes.io~nfs/pvc-8352d87c-c342-4777-892f-ef94f02d8ded failed: Stale file handle

The PVCs are created and bound as usual, but pods cannot mount them and write to them.

fragolinux • Jul 24 '22 13:07


It's not a bug in k3d but a limitation of the k3s Docker image.

The k3s Docker image is built from scratch with no NFS support (see its Dockerfile).

As a result, neither the k3s node container nor the pods running inside the node can mount NFS.
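One way to check this on a running cluster is to look for the mount.nfs helper inside the node container. This is only a rough sketch: the node_name and node_has_mount_nfs helpers are hypothetical, and the container naming pattern is assumed from the k3d logs in this thread (k3d-test-server-0).

```shell
#!/bin/sh
# Assumption: k3d names server node containers k3d-<cluster>-server-<index>,
# as seen in the logs in this thread (k3d-test-server-0).
node_name() {
  printf 'k3d-%s-server-%s\n' "$1" "${2:-0}"
}

# Returns 0 if the node container has a mount.nfs helper on its PATH.
# Needs a running cluster and access to the Docker daemon.
node_has_mount_nfs() {
  docker exec "$(node_name "$@")" sh -c 'command -v mount.nfs >/dev/null 2>&1'
}

# Usage:
#   node_has_mount_nfs test && echo "mount.nfs found" || echo "no mount.nfs in node image"
```

A missing helper would be consistent with the "Not supported" error from kubelet's mount call, though it does not rule out other causes.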

There is a workaround: rebase the k3s image on Alpine and install nfs-utils.

FROM alpine:latest

RUN set -ex; \
    apk add --no-cache iptables ip6tables nfs-utils; \
    echo 'hosts: files dns' > /etc/nsswitch.conf

COPY --from=rancher/k3s:v1.24.3-k3s1 /bin /opt/k3s/bin

VOLUME /var/lib/kubelet
VOLUME /var/lib/rancher/k3s
VOLUME /var/lib/cni
VOLUME /var/log

ENV PATH="$PATH:/opt/k3s/bin:/opt/k3s/bin/aux"
ENV CRI_CONFIG_FILE="/var/lib/rancher/k3s/agent/etc/crictl.yaml"

ENTRYPOINT ["/opt/k3s/bin/k3s"]
CMD ["agent"]

Build it yourself or have a look at mine: maoxuner/k3s (not updated frequently).
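For anyone building it locally, the steps could look roughly like this. The image tag local/k3s-nfs:v1.24.3-k3s1 is a made-up example, the run and build_and_create helpers are hypothetical, and this assumes the Dockerfile above is in the current directory. Setting DRY_RUN=1 only prints the commands.

```shell
#!/bin/sh
# Print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical local tag; pick whatever name suits you.
IMAGE="${IMAGE:-local/k3s-nfs:v1.24.3-k3s1}"

# Build the image from the Dockerfile in the current directory, then
# create a cluster named "test" that uses it for its nodes (-i, as used
# elsewhere in this thread).
build_and_create() {
  run docker build -t "$IMAGE" . &&
    run k3d cluster create test -i "$IMAGE"
}

# Usage:
#   DRY_RUN=1 build_and_create   # show the commands
#   build_and_create             # actually build and create
```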

I don't know how to patch nfs-utils into the official k3s image. If anyone knows, please tell me.

maoxuner • Aug 19 '22 08:08

@pawmaster I tried that, but it didn't work for me. Any hints?

k3d cluster create test -i maoxuner/k3s:v1.24.3-k3s1
INFO[0000] Prep: Network
INFO[0000] Created network 'k3d-test'
INFO[0000] Created image volume k3d-test-images
INFO[0000] Starting new tools node...
INFO[0000] Starting Node 'k3d-test-tools'
INFO[0001] Creating node 'k3d-test-server-0'
INFO[0001] Creating LoadBalancer 'k3d-test-serverlb'
INFO[0001] Using the k3d-tools node to gather environment information
INFO[0001] Starting new tools node...
INFO[0001] Starting Node 'k3d-test-tools'
INFO[0003] Starting cluster 'test'
INFO[0003] Starting servers...
INFO[0003] Starting Node 'k3d-test-server-0'
ERRO[0003] Failed Cluster Start: Failed to start server k3d-test-server-0: Node k3d-test-server-0 failed to get ready: error waiting for log line `k3s is up and running` from node 'k3d-test-server-0': stopped returning log lines
ERRO[0003] Failed to create cluster >>> Rolling Back
INFO[0003] Deleting cluster 'test'
INFO[0004] Deleting cluster network 'k3d-test'
INFO[0004] Deleting 2 attached volumes...
WARN[0004] Failed to delete volume 'k3d-test-images' of cluster 'test': failed to find volume 'k3d-test-images': Error: No such volume: k3d-test-images -> Try to delete it manually
FATA[0004] Cluster creation FAILED, all changes have been rolled back!

fragolinux • Aug 26 '22 09:08

I created a similar image, based on the version I need (1.22), and have the same issue. Is something missing in the Dockerfile?

k3d cluster create test -i ghcr.io/ecomind/k3s-nfs:1.22.12-k3s1
INFO[0000] Prep: Network
INFO[0000] Created network 'k3d-test'
INFO[0000] Created image volume k3d-test-images
INFO[0000] Starting new tools node...
INFO[0000] Starting Node 'k3d-test-tools'
INFO[0001] Creating node 'k3d-test-server-0'
INFO[0001] Creating LoadBalancer 'k3d-test-serverlb'
INFO[0001] Using the k3d-tools node to gather environment information
INFO[0001] Starting new tools node...
INFO[0001] Starting Node 'k3d-test-tools'
INFO[0003] Starting cluster 'test'
INFO[0003] Starting servers...
INFO[0003] Starting Node 'k3d-test-server-0'
ERRO[0003] Failed Cluster Start: Failed to start server k3d-test-server-0: Node k3d-test-server-0 failed to get ready: error waiting for log line `k3s is up and running` from node 'k3d-test-server-0': stopped returning log lines
ERRO[0003] Failed to create cluster >>> Rolling Back
INFO[0003] Deleting cluster 'test'
INFO[0004] Deleting cluster network 'k3d-test'
INFO[0004] Deleting 2 attached volumes...
WARN[0004] Failed to delete volume 'k3d-test-images' of cluster 'test': failed to find volume 'k3d-test-images': Error: No such volume: k3d-test-images -> Try to delete it manually
FATA[0004] Cluster creation FAILED, all changes have been rolled back!

fragolinux • Aug 26 '22 09:08

I've run into the same issue before. I tried cleaning up all resources (image, container, volume, network) and then creating the cluster, again and again, until it finally succeeded. But I don't know what happened. That's why I'm looking for a way to patch the original image.

maoxuner • Aug 26 '22 11:08

@pawmaster I think I fixed it. Take a look at my repo: I just left the paths as in the original image (no /opt...), and the image now comes up with no problem. Now let's see if NFS works :D

try: k3d cluster create test -i ghcr.io/ecomind/k3s-nfs:1.22.13-k3s1

fragolinux • Aug 26 '22 16:08

@fragolinux It's not good practice to overwrite Alpine's binaries directly with those from the original (scratch) image, because the binaries may be incompatible.

A better way is to replace all binaries with Alpine packages. However, I couldn't find packages that provide files such as /bin/aux/xtables, so I copied all the binaries to /opt/k3s/bin instead.

Anyway, if it works, it's still a good idea.

By the way, do you know any method to back up and restore (multi-node) clusters created by k3d? I've tried backing up /var/lib/rancher/k3s/server/db (SQLite by default), but a new cluster can't be restored from it.

maoxuner • Aug 27 '22 00:08

Hey, I got NFS to work in k3d for GitHub Codespaces based on the info from this thread.

It's mostly the same as @marcoaraujojunior's commit https://github.com/marcoaraujojunior/k3s-docker/commit/914c6f84e0b086ba86b15806062771d9fae5c274, with touch /run/openrc/softlevel added to the entrypoint script. I also had to figure out that export K3D_FIX_CGROUPV2=false is needed so that the entrypoint isn't overridden.

Try with

export K3D_FIX_CGROUPV2=false
k3d cluster create -i ghcr.io/jlian/k3d-nfs:v1.25.3-k3s1

jlian • Jun 29 '23 23:06

@jlian, instead of disabling k3d's entrypoints (there are actually multiple), just add your script to the list by placing it at /bin/k3d-entrypoint-*.sh, replacing the * with the name of your script. k3d will then execute it alongside the other entrypoint scripts. I hope that we can expose this more easily using the lifecycle hooks at some point.
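A hook script along those lines might look like the sketch below, saved in the image as, say, /bin/k3d-entrypoint-nfs.sh (a hypothetical name) and made executable. The touch /run/openrc/softlevel step comes from this thread. The rpcbind and rpc.statd service names are assumptions based on Alpine's rpcbind and nfs-utils packages, and the start_nfs_client function is made up for illustration, so treat this as a starting point rather than the exact script used in any of the images mentioned here.

```shell
#!/bin/sh
# Hypothetical k3d entrypoint hook: /bin/k3d-entrypoint-nfs.sh
# Assumes an Alpine-based k3s image with openrc, rpcbind and nfs-utils installed.

start_nfs_client() {
  # OpenRC refuses to start services unless it believes the system has
  # booted; creating this file works around that inside a container.
  mkdir -p /run/openrc
  touch /run/openrc/softlevel

  # Assumed service names from Alpine's rpcbind and nfs-utils packages.
  rc-service rpcbind start
  rc-service rpc.statd start
}

if command -v rc-service >/dev/null 2>&1; then
  start_nfs_client
else
  # Not an OpenRC-based image: do nothing rather than break node startup.
  echo "rc-service not found, skipping NFS client setup" >&2
fi
```

Starting rpc.statd is what the "rpc.statd is not running" mount error asks for when NFSv3 locking is in use.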

iwilltry42 • Jun 30 '23 09:06

@iwilltry42 OK thanks, got it to work. Now it just needs k3d cluster create -i ghcr.io/jlian/k3d-nfs:v1.25.3-k3s1

It took me a while to find the entrypoint logs in /var/log, as opposed to the Docker container logs. I also noticed that this custom entrypoint method doesn't work on k3d v4 and older.

jlian • Jun 30 '23 19:06

I am not currently experiencing the issues @jlian is experiencing.

I have created a repository with the latest images from the 1.25, 1.26, and 1.27 channels, as well as the "stable" channel at https://github.com/ryan-mcd/k3s-containers

Feel free to utilize these images.

ryan-mcd • Jul 09 '23 18:07

Wow!!! Thanks!!!

I lost 5 hours on my first test trying to create an NFS share for my pods on a Synology!! And it was all k3d's fault!!!

+1 to fixing this. Even if k3d is mainly meant for testing, not having NFS for storage is a strange pain!

Any idea how to warn the k3d company about this more directly?

Thanks @jlian for the 1.25 image; is there no 1.26 or later? (And thanks for Gloomhaven; I'm starting with Jaws of the Lion, and may try to adapt it for those scenarios!)

dcpc007 • Jul 27 '23 12:07

@ryan-mcd you got NFS to work in Codespaces without using openrc? It's been a while, but I kind of remember that when I first tried it without openrc it kept not working. Can you show me which part of your Dockerfile makes it work?

EDIT: hmm, I tried your image and it didn't work for me. My pod gets FailedMount with MountVolume.SetUp failed for volume "pvc-***" : mount failed: exit status 32 and

Mounting command: mount
Mounting arguments: -t nfs -o vers=3*** /var/lib/kubelet/pods/***/volumes/kubernetes.io~nfs/pvc-***
Output: mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
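To tell a missing statd apart from other problems, the mount can be retried by hand inside the node container with the nolock option that the error message itself suggests. This is just a sketch: the run and try_nolock_mount helpers and the /tmp/nfs-test mount point are made up, and the node name and export in the usage line are example values taken from earlier in this thread. DRY_RUN=1 only prints the commands.

```shell
#!/bin/sh
# Print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# try_nolock_mount NODE SERVER:/EXPORT
# Mounts the export inside the node container with NFSv3 and local locking.
try_nolock_mount() {
  node="$1"
  nfs_export="$2"
  run docker exec "$node" mkdir -p /tmp/nfs-test &&
    run docker exec "$node" mount -t nfs -o vers=3,nolock "$nfs_export" /tmp/nfs-test
}

# Usage (example values):
#   try_nolock_mount k3d-test-server-0 host.k3d.internal:/
```

If this succeeds while the plain vers=3 mount fails, statd is the likely culprit. Note that nolock keeps locks local to the client, so it may not suit workloads that rely on NFS locking across pods.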

jlian • Nov 03 '23 01:11

@dcpc007 there is no company behind k3d. There's SUSE Rancher behind K3s though, which is what runs inside k3d, so feel free to open issues/PRs on https://github.com/k3s-io/k3s or ask them via Slack. Or, if you want, you can try to come up with an automated workflow that builds k3d-specific K3s images that include extra features (like NFS support) that may not be included upstream.

iwilltry42 • Nov 03 '23 05:11

@jlian I don't use Codespaces. Perhaps that's why I didn't have an issue without openrc. In my local environment it worked fine without it, so I didn't include it. I can certainly add it back.

Which version were you planning/attempting to use?

ryan-mcd • Nov 03 '23 11:11