kubernetes-zfs-provisioner
Containerized SSH-less provisioner
Hi,
The documentation states:
> Making a container image and creating ZFS datasets from a container is not exactly easy, as ZFS runs in kernel. While it's possible to pass /dev/zfs to a container so it can create and destroy datasets within the container, sharing the volume with NFS does not work.
>
> Setting sharenfs property to anything other than off invokes exportfs(8), that requires also running the NFS Server to reload its exports. Which is not the case in a container (see zfs(8)).
>
> But most importantly: Mounting /dev/zfs inside the provisioner container would mean that the datasets will only be created on the same host as the container currently runs.
>
> So, in order to "break out" of the container the zfs calls are wrapped and redirected to another host over SSH. This requires SSH private keys to be mounted in the container for a SSH user with sufficient permissions to run zfs commands on the target host.
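(For context, the wrapping described above amounts to something like the sketch below; the key path, user and host names are placeholders, not the project's actual configuration.)

```sh
#!/bin/sh
# Hypothetical zfs wrapper placed ahead of the real binary in PATH:
# every zfs invocation is forwarded to the ZFS host over SSH.
exec ssh -i /secrets/id_ed25519 zfs-user@zfs-host /usr/sbin/zfs "$@"
```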
I spent some time working on a small proof of concept that shows it is possible to create ZFS datasets from within a container and have the volumes shared over NFS by the container. The volume mounts are also visible to both the host and the container, which makes them shareable using HostPath.
I'm using this Dockerfile:

```dockerfile
FROM docker.io/library/alpine:3.20 AS runtime
ENTRYPOINT ["/entrypoint.sh"]
RUN apk add bash zfs nfs-utils
COPY kubernetes-zfs-provisioner /usr/bin/
COPY entrypoint.sh /
```
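(Nothing special about the build itself; assuming the kubernetes-zfs-provisioner binary has already been compiled into the build context, it is just:)

```sh
# Build and push the image; the tag is the one referenced in the Deployment below.
docker build -t jp39/zfs:latest .
docker push jp39/zfs:latest
```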
With this entrypoint.sh:

```sh
#!/bin/sh
# Start the RPC services the in-container NFS server needs, on fixed ports
# so they can be exposed through the container ports declared below.
rpcbind
rpc.statd --no-notify --port 32765 --outgoing-port 32766
rpc.mountd --port 32767
rpc.idmapd
# Start the NFS server itself (8 threads) on the standard port 2049.
rpc.nfsd --tcp --udp --port 2049 8
# Finally run the provisioner as the container's main process.
exec /usr/bin/kubernetes-zfs-provisioner
```
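With the NFS daemons running next to the provisioner, setting sharenfs no longer fails: exportfs(8) now has an NFS server to talk to inside the container. A quick manual check from inside the container looks something like this (the dataset name and share options are only examples):

```sh
# Create a shared dataset from inside the container and verify the export.
zfs create -o sharenfs='rw=@10.0.0.0/8,no_root_squash' tank/kubernetes/demo
exportfs -v              # the new mountpoint should be listed here
showmount -e localhost   # and advertised to NFS clients
zfs destroy tank/kubernetes/demo
```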
The secret sauce is to use `mountPropagation: Bidirectional` for the dataset volume mount, so each dataset mounted by the container is also visible on the host and vice versa:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: zfs-provisioner
  namespace: zfs-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: zfs-provisioner
  template:
    metadata:
      labels:
        app.kubernetes.io/name: zfs-provisioner
      namespace: zfs-system
    spec:
      serviceAccountName: zfs-provisioner
      containers:
        - name: provisioner
          image: jp39/zfs:latest
          volumeMounts:
            - name: dev-zfs
              mountPath: /dev/zfs
            - name: dataset
              mountPath: /tank/kubernetes
              mountPropagation: Bidirectional
          securityContext:
            privileged: true
            procMount: Unmasked
          ports:
            - containerPort: 2049
              protocol: TCP
            - containerPort: 111
              protocol: UDP
            - containerPort: 32765
              protocol: UDP
            - containerPort: 32767
              protocol: UDP
          env:
            - name: ZFS_NFS_HOSTNAME
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
      volumes:
        - name: dev-zfs
          hostPath:
            path: /dev/zfs
        - name: dataset
          hostPath:
            path: /tank/kubernetes
      nodeSelector:
        kubernetes.io/hostname: zfsnode
```
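As a sanity check that the propagation really works both ways, creating a dataset through the container should result in a mount that is also visible on the host (the dataset name below is just an example):

```sh
# Create a dataset via the container...
kubectl -n zfs-system exec deploy/zfs-provisioner -- \
  zfs create tank/kubernetes/propagation-test
# ...then verify the mount is visible on the host (zfsnode, per the nodeSelector).
ssh zfsnode findmnt /tank/kubernetes/propagation-test
# Clean up.
kubectl -n zfs-system exec deploy/zfs-provisioner -- \
  zfs destroy tank/kubernetes/propagation-test
```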
Note that I had to make a small patch to `kubernetes-zfs-provisioner` so that the pod IP address (contained in the `ZFS_NFS_HOSTNAME` environment variable) gets used as the `NFSVolumeSource`'s server address instead of the storage class's `hostname` parameter.
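The net effect on a provisioned PersistentVolume is roughly the following (values are illustrative, not taken from an actual run):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-3d5a8c1e                      # example name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 10.42.0.17                    # the pod IP from ZFS_NFS_HOSTNAME
    path: /tank/kubernetes/pvc-3d5a8c1e   # instead of the storage class's hostname parameter
```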
Is this something that would be worth having as a default configuration? It requires the ZFS host to be part of the cluster, but has the advantage of not requiring extra setup such as SSH keys.