[BUG] cannot mount nfs shares from inside pods
What did you do
I initially tested the OpenEBS nfs-provisioner on top of k3d's default local-path storage class. PVCs were created, but pods could not mount them, failing with "not permitted" or "not supported". I could mount the shares from inside the OpenEBS NFS-sharing pods themselves, even between them (a pod could mount its own shares AND the shares of the other pod, backed by a different PVC), but NO other pods could mount them; they all stay in ContainerCreating state, and I get the errors below in the events.
So I tried a different approach: an NFS server Docker container running on my host machine, connected to via the nfs-subdir-external-provisioner, with identical results. It seems I cannot get an RWX volume on k3d right now, whatever solution I try. Tested on both my dev machine (MacBook Pro, latest Big Sur) AND on an Ubuntu 22.04 VM (with, of course, the nfs-common package installed).
- How was the cluster created?
docker network create --subnet="172.22.0.0/16" --gateway="172.22.0.1" "internalNetwork"
k3d cluster create test --network internalNetwork
mkdir -p ~/nfsshare
docker run -d --net=internalNetwork -p 2049:2049 --name nfs --privileged -v ~/nfsshare:/nfsshare -e SHARED_DIRECTORY=/nfsshare itsthenetwork/nfs-server-alpine:latest
helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner --set nfs.server=host.k3d.internal --set nfs.path=/
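(A quick sanity check I use to rule out plain connectivity from the cluster network; just a sketch, netcat-openbsd being the Alpine package that provides nc:)
docker run --rm --net internalNetwork alpine sh -c 'apk add --no-cache netcat-openbsd >/dev/null && nc -zv nfs 2049'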
The provisioner pod stays in "ContainerCreating" state, and in the events I get:
Message: MountVolume.SetUp failed for volume "nfs-subdir-external-provisioner-root" : mount failed: exit status 255
Mounting command: mount
Mounting arguments: -t nfs host.k3d.internal:/ /var/lib/kubelet/pods/ddeba612-5e1c-4ae6-8068-f641f42706ca/volumes/kubernetes.io~nfs/nfs-subdir-external-provisioner-root
Output: mount: mounting host.k3d.internal:/ on /var/lib/kubelet/pods/ddeba612-5e1c-4ae6-8068-f641f42706ca/volumes/kubernetes.io~nfs/nfs-subdir-external-provisioner-root failed: Not supported
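(A quick check of whether the node image can mount NFS at all; just a diagnostic sketch, using the server container name from the cluster created above:)
docker exec k3d-test-server-0 sh -c 'command -v mount.nfs || echo "no mount.nfs helper in the node image"'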
So, let's try from an Ubuntu pod:
kubectl run ubuntu --image=ubuntu -- sleep infinity
# open a shell inside the pod, then:
apt update
apt install nfs-common -y
mkdir t
mount -t nfs host.k3d.internal:/ t # host is correctly resolved using dig...
mount.nfs: Operation not permitted
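(Side note: "Operation not permitted" from a plain pod is expected regardless, because an unprivileged container lacks CAP_SYS_ADMIN and cannot call mount(). Re-testing from a privileged pod, roughly as sketched below, would only rule out that part; --privileged is a standard kubectl run flag, and the pod name is just an example.)
kubectl run ubuntu-priv --image=ubuntu --privileged -- sleep infinity
kubectl exec -it ubuntu-priv -- bash -c 'apt update && apt install -y nfs-common && mkdir -p /t && mount -t nfs host.k3d.internal:/ /t'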
Test from the host to see whether the share works: it does.
mkdir t0
sudo mount -t nfs localhost:/ t0
touch t0/aaa
ls t0 # aaa exists
ls ~/nfsshare # and is visible in the share
touch ~/nfsshare/bbb
ls ~/nfsshare # bbb exists in share
ls t0 # and is visible in local folder
What did you expect to happen
The share should be mountable, so that RWX volumes can be created.
Which OS & Architecture
- output of k3d runtime-info:
arch: x86_64
cgroupdriver: systemd
cgroupversion: "2"
endpoint: /var/run/docker.sock
filesystem: extfs
name: docker
os: Ubuntu 22.04 LTS
ostype: linux
version: 20.10.12
Which version of k3d
- output of k3d version:
k3d version v5.4.4
k3s version v1.23.8-k3s1 (default)
Which version of docker
- output of docker version and docker info:
Client:
Version: 20.10.12
API version: 1.41
Go version: go1.17.3
Git commit: 20.10.12-0ubuntu4
Built: Mon Mar 7 17:10:06 2022
OS/Arch: linux/amd64
Context: default
Experimental: true
Server:
Engine:
Version: 20.10.12
API version: 1.41 (minimum version 1.12)
Go version: go1.17.3
Git commit: 20.10.12-0ubuntu4
Built: Mon Mar 7 15:57:50 2022
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: v1.6.6
GitCommit: 10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
runc:
Version: 1.1.3
GitCommit: v1.1.3-0-g6724737f
docker-init:
Version: 0.19.0
GitCommit:
Client:
Context: default
Debug Mode: false
Server:
Containers: 5
Running: 5
Paused: 0
Stopped: 0
Images: 24
Server Version: 20.10.12
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runtime.v1.linux runc io.containerd.runc.v2
Default Runtime: runc
Init Binary: docker-init
containerd version: 10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
runc version: v1.1.3-0-g6724737f
init version:
Security Options:
apparmor
seccomp
Profile: default
cgroupns
Kernel Version: 5.15.0-41-generic
Operating System: Ubuntu 22.04 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 3.827GiB
Name: lima-ubuntu
ID: EMKE:7WSJ:7M3C:3JFT:HS7J:MNZJ:SQKG:Y3RY:2S7Q:PMMO:DHRL:6K3P
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Update: I tried the Ganesha NFS provisioner too, same setup as above. Even that fails to create usable NFS shares. My pods are now in a different state (CreateContainerConfigError), and I get this in the events:
Message: MountVolume.SetUp failed for volume "pvc-8352d87c-c342-4777-892f-ef94f02d8ded" : mount failed: exit status 255
Mounting command: mount
Mounting arguments: -t nfs -o vers=4 10.43.8.160:/export/pvc-8352d87c-c342-4777-892f-ef94f02d8ded /var/lib/kubelet/pods/b56c8042-25aa-4d7b-9045-2b0827c03c8d/volumes/kubernetes.io~nfs/pvc-8352d87c-c342-4777-892f-ef94f02d8ded
Output: mount: mounting 10.43.8.160:/export/pvc-8352d87c-c342-4777-892f-ef94f02d8ded on /var/lib/kubelet/pods/b56c8042-25aa-4d7b-9045-2b0827c03c8d/volumes/kubernetes.io~nfs/pvc-8352d87c-c342-4777-892f-ef94f02d8ded failed: Stale file handle
Everything else is the same; the storage class is always "nfs". This is my HelmRelease:
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: ${release}
  namespace: flux-system
spec:
  chart:
    spec:
      chart: nfs-server-provisioner # nfs-provisioner
      version: "1.4.0" # "0.9.0"
      sourceRef:
        kind: HelmRepository
        name: ${release}
  interval: 1h0m0s
  releaseName: ${release}
  targetNamespace: ${namespace}
  values:
    # see https://artifacthub.io/packages/helm/kvaps/nfs-server-provisioner
    persistence:
      enabled: true
      storageClass: "local-path"
      size: 1Gi
    storageClass:
      defaultClass: false
      name: nfs
      reclaimPolicy: Retain
      mountOptions:
        - "vers=4"
I just tried the Rook NFS provisioner as well; that too does not work on k3d 5.4.4, with these errors:
Unable to attach or mount volumes: unmounted volumes=[rook-nfs-vol], unattached volumes=[rook-nfs-vol kube-api-access-tv299]: timed out waiting for the condition
Message: MountVolume.SetUp failed for volume "pvc-8352d87c-c342-4777-892f-ef94f02d8ded" : mount failed: exit status 255
Mounting command: mount
Mounting arguments: -t nfs -o vers=4 10.43.8.160:/export/pvc-8352d87c-c342-4777-892f-ef94f02d8ded /var/lib/kubelet/pods/a068abfc-8347-4f76-87b2-f9269b37c0db/volumes/kubernetes.io~nfs/pvc-8352d87c-c342-4777-892f-ef94f02d8ded
Output: mount: mounting 10.43.8.160:/export/pvc-8352d87c-c342-4777-892f-ef94f02d8ded on /var/lib/kubelet/pods/a068abfc-8347-4f76-87b2-f9269b37c0db/volumes/kubernetes.io~nfs/pvc-8352d87c-c342-4777-892f-ef94f02d8ded failed: Stale file handle
PVCs are created and bound correctly, but pods cannot mount them or write to them.
@fragolinux
It's not a bug in k3d but a limitation of the k3s Docker image.
The k3s Docker image is built from scratch with no NFS support (see its Dockerfile).
As a result, neither the k3s node container nor the pods inside the node can mount NFS.
There is a workaround: rebase the k3s image on Alpine and install nfs-utils.
FROM alpine:latest
RUN set -ex; \
    apk add --no-cache iptables ip6tables nfs-utils; \
    echo 'hosts: files dns' > /etc/nsswitch.conf
COPY --from=rancher/k3s:v1.24.3-k3s1 /bin /opt/k3s/bin
VOLUME /var/lib/kubelet
VOLUME /var/lib/rancher/k3s
VOLUME /var/lib/cni
VOLUME /var/log
ENV PATH="$PATH:/opt/k3s/bin:/opt/k3s/bin/aux"
ENV CRI_CONFIG_FILE="/var/lib/rancher/k3s/agent/etc/crictl.yaml"
ENTRYPOINT ["/opt/k3s/bin/k3s"]
CMD ["agent"]
Build it yourself or have a look at mine: maoxuner/k3s (not maintained very actively).
I don't know how to patch nfs-utils into the official k3s image. If anyone knows how, please tell me.
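To build and use it yourself, roughly (a sketch; the image tag is just an example):
docker build -t my-k3s-nfs:v1.24.3-k3s1 .
k3d cluster create test -i my-k3s-nfs:v1.24.3-k3s1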
@pawmaster I tried that, but it didn't work for me... any hints?
k3d cluster create test -i maoxuner/k3s:v1.24.3-k3s1
INFO[0000] Prep: Network
INFO[0000] Created network 'k3d-test'
INFO[0000] Created image volume k3d-test-images
INFO[0000] Starting new tools node...
INFO[0000] Starting Node 'k3d-test-tools'
INFO[0001] Creating node 'k3d-test-server-0'
INFO[0001] Creating LoadBalancer 'k3d-test-serverlb'
INFO[0001] Using the k3d-tools node to gather environment information
INFO[0001] Starting new tools node...
INFO[0001] Starting Node 'k3d-test-tools'
INFO[0003] Starting cluster 'test'
INFO[0003] Starting servers...
INFO[0003] Starting Node 'k3d-test-server-0'
ERRO[0003] Failed Cluster Start: Failed to start server k3d-test-server-0: Node k3d-test-server-0 failed to get ready: error waiting for log line `k3s is up and running` from node 'k3d-test-server-0': stopped returning log lines
ERRO[0003] Failed to create cluster >>> Rolling Back
INFO[0003] Deleting cluster 'test'
INFO[0004] Deleting cluster network 'k3d-test'
INFO[0004] Deleting 2 attached volumes...
WARN[0004] Failed to delete volume 'k3d-test-images' of cluster 'test': failed to find volume 'k3d-test-images': Error: No such volume: k3d-test-images -> Try to delete it manually
FATA[0004] Cluster creation FAILED, all changes have been rolled back!
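(What I'd check next, roughly; a sketch, not verified:)
k3d cluster create test -i maoxuner/k3s:v1.24.3-k3s1 --no-rollback || true   # keep the failed node around instead of rolling back
docker logs k3d-test-server-0 2>&1 | tail -n 50
# and confirm the rebased image actually finds the k3s binary on PATH:
docker run --rm --entrypoint sh maoxuner/k3s:v1.24.3-k3s1 -c 'command -v k3s && k3s --version'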
I created a similar image, based on the version I need (1.22), and have the same issue... is something missing in the Dockerfile?
k3d cluster create test -i ghcr.io/ecomind/k3s-nfs:1.22.12-k3s1
INFO[0000] Prep: Network
INFO[0000] Created network 'k3d-test'
INFO[0000] Created image volume k3d-test-images
INFO[0000] Starting new tools node...
INFO[0000] Starting Node 'k3d-test-tools'
INFO[0001] Creating node 'k3d-test-server-0'
INFO[0001] Creating LoadBalancer 'k3d-test-serverlb'
INFO[0001] Using the k3d-tools node to gather environment information
INFO[0001] Starting new tools node...
INFO[0001] Starting Node 'k3d-test-tools'
INFO[0003] Starting cluster 'test'
INFO[0003] Starting servers...
INFO[0003] Starting Node 'k3d-test-server-0'
ERRO[0003] Failed Cluster Start: Failed to start server k3d-test-server-0: Node k3d-test-server-0 failed to get ready: error waiting for log line `k3s is up and running` from node 'k3d-test-server-0': stopped returning log lines
ERRO[0003] Failed to create cluster >>> Rolling Back
INFO[0003] Deleting cluster 'test'
INFO[0004] Deleting cluster network 'k3d-test'
INFO[0004] Deleting 2 attached volumes...
WARN[0004] Failed to delete volume 'k3d-test-images' of cluster 'test': failed to find volume 'k3d-test-images': Error: No such volume: k3d-test-images -> Try to delete it manually
FATA[0004] Cluster creation FAILED, all changes have been rolled back!
I've run into the same issue before. I tried cleaning up all resources (images, containers, volumes, networks), then creating the cluster again. I repeated this again and again and it finally succeeded, but I don't know what happened. That's why I'm looking for some way to patch the original image.
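Roughly the cleanup loop I mean (a sketch; the names match the failed cluster above):
k3d cluster delete test || true
docker volume rm k3d-test-images 2>/dev/null || true   # the volume the rollback failed to delete
docker network rm k3d-test 2>/dev/null || true
docker image rm maoxuner/k3s:v1.24.3-k3s1               # force a fresh pull on the next attempt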
@pawmaster I think I fixed it... take a look at my repo: I just left the paths as in the original image (no /opt...), and the image now comes up with no problem... now let's see if NFS works :D
try: k3d cluster create test -i ghcr.io/ecomind/k3s-nfs:1.22.13-k3s1
@fragolinux It's not good practice to overwrite the Alpine binaries with those from the original (scratch-based) image directly; there may be incompatibilities between the binaries.
A better way would be to replace all the binaries with Alpine packages, but I can't find packages providing binaries such as /bin/aux/xtables, which is why I copied all the binaries to /opt/k3s/bin instead.
Anyway, if it works, it's still a good approach.
By the way, do you know of any method to back up and restore clusters (multiple nodes) created by k3d? I've tried backing up /var/lib/rancher/k3s/server/db (sqlite by default), but a new cluster can't be restored from it.
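What I've tried looks roughly like this (a sketch, not a working recipe; my guess is that the token and TLS material under /var/lib/rancher/k3s/server also have to match, which may be why the restore fails):
mkdir -p ./k3s-backup
docker cp k3d-test-server-0:/var/lib/rancher/k3s/server/db ./k3s-backup/db
docker cp k3d-test-server-0:/var/lib/rancher/k3s/server/token ./k3s-backup/token
docker cp k3d-test-server-0:/var/lib/rancher/k3s/server/tls ./k3s-backup/tls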
Hey, I got NFS to work in k3d for GitHub Codespaces based on the info from this thread:
https://github.com/jlian/k3d-nfs
It's mostly the same as @marcoaraujojunior's commit https://github.com/marcoaraujojunior/k3s-docker/commit/914c6f84e0b086ba86b15806062771d9fae5c274, with touch /run/openrc/softlevel added to the entrypoint script, plus figuring out that export K3D_FIX_CGROUPV2=false needs to be set so that the entrypoint isn't overridden.
Try with
export K3D_FIX_CGROUPV2=false
k3d cluster create -i ghcr.io/jlian/k3d-nfs:v1.25.3-k3s1
@jlian, instead of disabling k3d's entrypoints (there are actually multiple), just add your script to the list by putting it at /bin/k3d-entrypoint-*.sh, replacing the * with the name of your script.
This will make k3d execute it alongside the other entrypoint scripts.
I hope that we can expose this more easily using the lifecycle hooks at some point.
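A minimal sketch of such a drop-in script, assuming an image that already has openrc, rpcbind and nfs-utils installed (the file name and exact service names are just examples, not something k3d ships):
#!/bin/sh
# Saved into the image as /bin/k3d-entrypoint-nfs.sh (COPY + chmod +x in the
# Dockerfile) so that k3d runs it alongside its other entrypoint scripts.
set -e
mkdir -p /run/openrc
touch /run/openrc/softlevel          # let OpenRC think a runlevel is active
rc-service rpcbind start || true     # rpcbind/statd are needed for NFSv3 locking
rc-service rpc.statd start || true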
@iwilltry42 OK thanks, got it to work. Now it just needs:
k3d cluster create -i ghcr.io/jlian/k3d-nfs:v1.25.3-k3s1
It took me a while to find the entrypoint logs in /var/log as opposed to the docker container logs. Also, I noticed that this custom entrypoint method doesn't work on k3d v4 and older.
I am not currently running into the issues @jlian describes.
I have created a repository with the latest images from the 1.25, 1.26, and 1.27 channels, as well as the "stable" channel, at https://github.com/ryan-mcd/k3s-containers
Feel free to use these images.
Wow!!! Thanks!!!
I lost 5 hours on my first test trying to create an NFS share for my pods on a Synology, and it was all k3d's fault!
+1 to fix this; even if k3d's purpose is mainly testing, not having NFS for storage is a strange pain!
Any idea how to alert the people behind k3d about this more directly?
Thanks @jlian for the 1.25 image; no 1.26 or later? (And thanks for Gloomhaven, I'm starting with Jaws of the Lion, maybe I'll try to adapt it for those scenarios!)
@ryan-mcd you got NFS to work in Codespaces without using openrc? It's been a while, but I seem to remember that when I first tried it without openrc it kept failing. Can you show me which part of your Dockerfile makes it work?
EDIT: hmm, I tried your image and it didn't work for me; my pod gets FailedMount with MountVolume.SetUp failed for volume "pvc-***" : mount failed: exit status 32 and:
Mounting command: mount
Mounting arguments: -t nfs -o vers=3 10.43.210.51:/export/pvc-*** /var/lib/kubelet/pods/***/volumes/kubernetes.io~nfs/pvc-***
Output: mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
@dcpc007 there is no company behind k3d. There is SUSE Rancher behind K3s though, which is what runs inside k3d, so feel free to open issues/PRs on https://github.com/k3s-io/k3s or ask them via Slack. Or, if you want, you can try to come up with an automated workflow that builds k3d-specific images of K3s that include extra features (like NFS support) that may not be included upstream.
@jlian I don't use Codespaces; perhaps that's why I didn't have an issue without openrc. In my local environment it worked fine without it, so I didn't include it. I can certainly add it back.
Which version were you planning/attempting to use?