NFS provisioner fails

yspreen opened this issue 4 years ago • 3 comments

Hey, great project! I tried to get the Helm NFS server provisioner running and ran into some roadblocks on a CentOS 8 machine on GCE.

First, I just start the Helm chart with this config (run through envsubst):

replicaCount: 1

image:
  repository: quay.io/kubernetes_incubator/nfs-provisioner
  tag: v2.2.1-k8s1.12
  pullPolicy: IfNotPresent

service:
  type: ClusterIP

  nfsPort: 2049
  mountdPort: 20048
  rpcbindPort: 51413
  externalIPs: []

persistence:
  enabled: true
  storageClass: "-"
  accessMode: ReadWriteOnce
  size: "100Gi"

storageClass:
  create: true
  defaultClass: false
  name: nfs
  allowVolumeExpansion: true
  parameters: {}

  mountOptions:
    - vers=4.1
    - noatime

  reclaimPolicy: Retain

rbac:
  create: true
  serviceAccountName: default

resources:
  {}
nodeSelector:
  kubernetes.io/hostname: ${nodename}
tolerations: []
affinity: {}
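
For reference, this is roughly how the rendered values get fed to Helm; a minimal sketch assuming Helm 3, the stable/nfs-server-provisioner chart, and a template file named values.yaml.tmpl (chart source and file name are guesses; only the release name nfs-release is taken from the pod names below):

# render the ${...} placeholders, then install the chart with the resulting values
envsubst < values.yaml.tmpl > values.yaml
helm install nfs-release stable/nfs-server-provisioner -f values.yaml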

Afterwards, the PV the chart needs is created and a PVC is bound to the nfs storage class:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: "${nfspvcname}"
spec:
  capacity:
    storage: "100Gi"
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "${mainstoragepath}"
  claimRef:
    namespace: default
    name: "${nfspvcname}"
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: default-pvc
spec:
  storageClassName: "nfs"
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: "90Gi"

${nfspvcname} is set to the name of the PVC that the NFS chart creates (data-nfs-release-nfs-server-provisioner-0 in the output below).
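
The PV/PVC manifests above go through the same envsubst step; a minimal sketch, with pv-and-pvc.yaml.tmpl as a hypothetical file name:

# render ${nfspvcname} and ${mainstoragepath}, then create both objects
envsubst < pv-and-pvc.yaml.tmpl | kubectl apply -f -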

Now the pod for the NFS provisioner crashes continuously:

kubectl describe pod nfs-release-nfs-server-provisioner-0
[INFO] Entering RootlessKit namespaces: OK
Name:         nfs-release-nfs-server-provisioner-0
Namespace:    default
Node:         centos/10.0.2.100
Start Time:   Fri, 03 Jan 2020 17:49:16 +0000
Labels:       app=nfs-server-provisioner
              chart=nfs-server-provisioner-0.3.2
              controller-revision-hash=nfs-release-nfs-server-provisioner-79c9977558
              heritage=Helm
              release=nfs-release
              statefulset.kubernetes.io/pod-name=nfs-release-nfs-server-provisioner-0
Annotations:  <none>
Status:       Running
IP:           10.88.0.4
IPs:
  IP:           10.88.0.4
Controlled By:  StatefulSet/nfs-release-nfs-server-provisioner
Containers:
  nfs-server-provisioner:
    Container ID:  docker://8ce423f7c0df95d08a4c49531b0fd59d6a8e8d97afd9e7756a99a38e51b9736f
    Image:         quay.io/kubernetes_incubator/nfs-provisioner:v2.2.1-k8s1.12
    Image ID:      docker-pullable://quay.io/kubernetes_incubator/nfs-provisioner@sha256:f0f0d9d39f8aac4a2f39a1b0b602baa993bca0f22c982f208ca9d7a0d2b2399f
    Ports:         2049/TCP, 20048/TCP, 111/TCP, 111/UDP
    Host Ports:    0/TCP, 0/TCP, 0/TCP, 0/UDP
    Args:
      -provisioner=cluster.local/nfs-release-nfs-server-provisioner
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Fri, 03 Jan 2020 17:56:16 +0000
      Finished:     Fri, 03 Jan 2020 17:56:23 +0000
    Ready:          False
    Restart Count:  6
    Environment:
      POD_IP:          (v1:status.podIP)
      SERVICE_NAME:   nfs-release-nfs-server-provisioner
      POD_NAMESPACE:  default (v1:metadata.namespace)
    Mounts:
      /export from data (rw)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  data:
    Type:        PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:   data-nfs-release-nfs-server-provisioner-0
    ReadOnly:    false
QoS Class:       BestEffort
Node-Selectors:  kubernetes.io/hostname=centos
Tolerations:     <none>
Events:
  Type     Reason             Age                  From               Message
  ----     ------             ----                 ----               -------
  Warning  FailedScheduling   <unknown>            default-scheduler  error while running "VolumeBinding" filter plugin for pod "nfs-release-nfs-server-provisioner-0": pod has unbound immediate PersistentVolumeClaims
  Warning  FailedScheduling   <unknown>            default-scheduler  error while running "VolumeBinding" filter plugin for pod "nfs-release-nfs-server-provisioner-0": pod has unbound immediate PersistentVolumeClaims
  Normal   Scheduled          <unknown>            default-scheduler  Successfully assigned default/nfs-release-nfs-server-provisioner-0 to centos
  Normal   Pulling            11m                  kubelet, centos    Pulling image "quay.io/kubernetes_incubator/nfs-provisioner:v2.2.1-k8s1.12"
  Normal   Pulled             10m                  kubelet, centos    Successfully pulled image "quay.io/kubernetes_incubator/nfs-provisioner:v2.2.1-k8s1.12"
  Normal   Pulled             10m (x2 over 10m)    kubelet, centos    Container image "quay.io/kubernetes_incubator/nfs-provisioner:v2.2.1-k8s1.12" already present on machine
  Normal   Created            10m (x3 over 10m)    kubelet, centos    Created container nfs-server-provisioner
  Normal   Started            10m (x3 over 10m)    kubelet, centos    Started container nfs-server-provisioner
  Warning  BackOff            9m47s (x3 over 10m)  kubelet, centos    Back-off restarting failed container
  Warning  MissingClusterDNS  50s (x57 over 11m)   kubelet, centos    pod: "nfs-release-nfs-server-provisioner-0_default(ba5bd6f1-c4af-4f78-93f2-c9a48d126ba9)". kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to "Default" policy.

Since the error seems to be related to missing DNS services, I tried to set up kube-dns via https://github.com/kelseyhightower/kubernetes-the-hard-way/blob/master/docs/12-dns-addon.md
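
The piece of that add-on relevant here is the kube-dns Service; a rough sketch of it from memory (the guide's full coredns.yaml also contains the CoreDNS ConfigMap and Deployment), with the clusterIP line being the part that matters:

apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.0.0.10   # the guide uses 10.32.0.10; it has to sit inside this cluster's service CIDR
  ports:
    - name: dns
      port: 53
      protocol: UDP
    - name: dns-tcp
      port: 53
      protocol: TCP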

I had to change the IP from 10.32.0.10 to 10.0.0.10, but the DNS pod also fails:

kubectl describe pod coredns-68567cdb47-78x67 --namespace kube-system
[INFO] Entering RootlessKit namespaces: OK
Name:                 coredns-68567cdb47-78x67
Namespace:            kube-system
Priority Class Name:  system-cluster-critical
Node:                 centos/10.0.2.100
Start Time:           Fri, 03 Jan 2020 17:48:54 +0000
Labels:               k8s-app=kube-dns
                      pod-template-hash=68567cdb47
Annotations:          <none>
Status:               Running
IP:                   10.88.0.3
IPs:
  IP:           10.88.0.3
Controlled By:  ReplicaSet/coredns-68567cdb47
Containers:
  coredns:
    Container ID:  docker://387906805acc0fff0f1bbf1e392e886e77c09363bbcce61720db4f316862aaa7
    Image:         coredns/coredns:1.6.2
    Image ID:      docker-pullable://coredns/coredns@sha256:12eb885b8685b1b13a04ecf5c23bc809c2e57917252fd7b0be9e9c00644e8ee5
    Ports:         53/UDP, 53/TCP, 9153/TCP
    Host Ports:    0/UDP, 0/TCP, 0/TCP
    Args:
      -conf
      /etc/coredns/Corefile
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Fri, 03 Jan 2020 17:59:48 +0000
      Finished:     Fri, 03 Jan 2020 17:59:48 +0000
    Ready:          False
    Restart Count:  7
    Limits:
      memory:  170Mi
    Requests:
      cpu:        100m
      memory:     70Mi
    Liveness:     http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
    Readiness:    http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /etc/coredns from config-volume (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  config-volume:
    Type:        ConfigMap (a volume populated by a ConfigMap)
    Name:        coredns
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  beta.kubernetes.io/os=linux
Tolerations:     CriticalAddonsOnly
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  <unknown>             default-scheduler  Successfully assigned kube-system/coredns-68567cdb47-78x67 to centos
  Normal   Pulling    14m                   kubelet, centos    Pulling image "coredns/coredns:1.6.2"
  Normal   Pulled     14m                   kubelet, centos    Successfully pulled image "coredns/coredns:1.6.2"
  Normal   Created    13m (x5 over 14m)     kubelet, centos    Created container coredns
  Normal   Started    13m (x5 over 14m)     kubelet, centos    Started container coredns
  Normal   Pulled     13m (x4 over 14m)     kubelet, centos    Container image "coredns/coredns:1.6.2" already present on machine
  Warning  BackOff    4m51s (x50 over 14m)  kubelet, centos    Back-off restarting failed container
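
The describe output only shows that CoreDNS exits with code 1; the actual reason would be in the log of the previous container attempt, retrievable with something like:

# print the log of the last crashed coredns container
kubectl logs coredns-68567cdb47-78x67 --namespace kube-system --previous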

Stderr of run.sh:

[kubelet-dockershim] E0103 18:03:48.735562     114 container_manager_linux.go:477] cpu and memory cgroup hierarchy not unified.  cpu: /, memory: /user.slice/user-1000.slice/session-1.scope
[kubelet-dockershim] E0103 18:03:48.867398     114 container_manager_linux.go:101] Unable to ensure the docker processes run in the desired containers: errors moving "dockerd" pid: failed to apply oom score -999 to PID 85: write /proc/85/oom_score_adj: permission denied
...
[dockerd] time="2020-01-03T18:04:56.269919903Z" level=info msg="shim containerd-shim started" address=/containerd-shim/6ba264bcbcc738a4686c6b6bbc36cd4c96cbd3a5ff04b2a14b4064f48779d088.sock debug=false pid=7150
[dockerd] time="2020-01-03T18:04:56.903241134Z" level=info msg="shim reaped" id=d023513cf1411c90f83bf1a30d2d55a3e3cde300d59dea05f2ecafea011be2e1
[dockerd] time="2020-01-03T18:04:56.913507529Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
[dockerd] time="2020-01-03T18:05:05.238881369Z" level=info msg="shim containerd-shim started" address=/containerd-shim/1a5da1ec06ca697778b35df31d55a6f5befe987f6a82b138b3def79ccab895eb.sock debug=false pid=7269
[dockerd] time="2020-01-03T18:05:05.601143797Z" level=info msg="shim reaped" id=e6eebcd70064eed6d293f505e98c9e6a3d2a682213a281cdfdebecf245b2f3fa
[dockerd] time="2020-01-03T18:05:05.611475434Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
...
[kubelet-dockershim] E0103 18:05:06.218907     114 pod_workers.go:191] Error syncing pod 10011dcf-1fcc-4f90-9692-c27225bfb393 ("coredns-68567cdb47-78x67_kube-system(10011dcf-1fcc-4f90-9692-c27225bfb393)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "back-off 5m0s restarting failed container=coredns pod=coredns-68567cdb47-78x67_kube-system(10011dcf-1fcc-4f90-9692-c27225bfb393)"
[kubelet-dockershim] E0103 18:05:06.595575     114 pod_workers.go:191] Error syncing pod 9ff213d8-36e9-448c-9675-8f344be436fc ("coredns-68567cdb47-xvxnv_kube-system(9ff213d8-36e9-448c-9675-8f344be436fc)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "back-off 5m0s restarting failed container=coredns pod=coredns-68567cdb47-xvxnv_kube-system(9ff213d8-36e9-448c-9675-8f344be436fc)"
[kubelet-dockershim] E0103 18:06:14.151994     114 pod_workers.go:191] Error syncing pod ba5bd6f1-c4af-4f78-93f2-c9a48d126ba9 ("nfs-release-nfs-server-provisioner-0_default(ba5bd6f1-c4af-4f78-93f2-c9a48d126ba9)"), skipping: failed to "StartContainer" for "nfs-server-provisioner" with CrashLoopBackOff: "back-off 5m0s restarting failed container=nfs-server-provisioner pod=nfs-release-nfs-server-provisioner-0_default(ba5bd6f1-c4af-4f78-93f2-c9a48d126ba9)"

yspreen · Jan 03 '20 18:01

Regardless of the DNS issue, an unprivileged user currently can't mount NFS. We probably need some privileged helper daemon for the persistent-volume handling, or maybe we can use a FUSE implementation of NFS.

cc @giuseppe @rhatdan

AkihiroSuda · Jan 06 '20 04:01

W.r.t. the DNS issue, maybe you should try the rootless mode of k3s. It is based on Usernetes but includes the DNS stack by default.
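
For the record, rootless mode is enabled with an experimental flag on the k3s server; the exact invocation may differ between k3s versions:

# run the whole k3s server (kubelet, containerd, etc.) as an unprivileged user
k3s server --rootless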

AkihiroSuda · Jan 06 '20 04:01

I would say the best bet for this is a FUSE-based NFS client, since it is not likely that user-namespace root will be allowed to mount an NFS share any time soon. NFS and user namespaces also do not work well together if you are going to have processes changing UIDs inside the environment: UID 1234 chowning a file to UID 5678 is blocked on the server side when it happens inside a user namespace. NFS enforces permissions at the server, which has no concept of a user namespace's CAP_CHOWN or CAP_DAC_OVERRIDE.
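
A rough illustration of that failure mode with rootless podman, assuming an NFS export the host has already mounted at /mnt/nfs (path, image, and UIDs are placeholders):

# bind the NFS-backed directory into a rootless container and try to chown a file
podman run --rm -v /mnt/nfs:/data alpine chown 5678:5678 /data/somefile
# expected to fail with "Operation not permitted": the NFS server only sees the
# unprivileged host UID making the request and ignores the namespace's CAP_CHOWN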

Another potential option would be to set up an automounter; then the host kernel could mount directories on demand when a containerized process enters the mount point.
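
With autofs, for example, the host-side setup could look roughly like this (server name, export path, and mount root are placeholders):

# /etc/auto.master: delegate the /mnt/nfs tree to the map below, unmount after 60s idle
/mnt/nfs  /etc/auto.nfs  --timeout=60

# /etc/auto.nfs: mount the export at /mnt/nfs/data on first access
data  -fstype=nfs4,vers=4.1  nfs-server.example.com:/export/data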

rhatdan · Jan 06 '20 14:01