nfs-subdir-external-provisioner

NFS folder created but no SUCCESS

AlanObject opened this issue 3 years ago · 2 comments

I am attempting to run the test after installing the provisioner using the off-the-shelf scripts in this repository. On the NFS server, the provisioned directory is created (suggesting that the NFS linkage is at least set up correctly) but the test pod can't mount it.

After applying the test-claim.yaml file, the following appears in the provisioner's log.

I0425 03:22:56.596241       1 controller.go:1317] provision "default/test-claim" class "nfs-client": started
I0425 03:22:56.606182       1 event.go:278] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"test-claim", UID:"ce28631b-dbec-4748-8d44-ee938fb7cb70", APIVersion:"v1", ResourceVersion:"6355819", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/test-claim"
I0425 03:22:57.002929       1 controller.go:1420] provision "default/test-claim" class "nfs-client": volume "pvc-ce28631b-dbec-4748-8d44-ee938fb7cb70" provisioned
I0425 03:22:57.003010       1 controller.go:1437] provision "default/test-claim" class "nfs-client": succeeded
I0425 03:22:57.003026       1 volume_store.go:212] Trying to save persistentvolume "pvc-ce28631b-dbec-4748-8d44-ee938fb7cb70"
I0425 03:22:57.019197       1 volume_store.go:219] persistentvolume "pvc-ce28631b-dbec-4748-8d44-ee938fb7cb70" saved
I0425 03:22:57.019305       1 event.go:278] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"test-claim", UID:"ce28631b-dbec-4748-8d44-ee938fb7cb70", APIVersion:"v1", ResourceVersion:"6355819", FieldPath:""}): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned volume pvc-ce28631b-dbec-4748-8d44-ee938fb7cb70

I don't see any problem. On the NFS system I get:

# tree
.
└── default-test-claim-pvc-ce28631b-dbec-4748-8d44-ee938fb7cb70

1 directory, 0 files

So the directory gets created, but when the test-pod.yaml file is applied, the test pod never gets the volume mounted. Its status:

$ kubectl get pod
NAME                                     READY   STATUS              RESTARTS      AGE
nfs-client-provisioner-7dd4bbdb4-xfzl7   1/1     Running             0             141m
test-pod                                 0/1     ContainerCreating   0             5m34s
$ kubectl describe pod/test-pod
Name:         test-pod
Namespace:    default
Priority:     0
Node:         carbon/64.71.145.125
Start Time:   Mon, 25 Apr 2022 03:23:13 +0000
Labels:       <none>
Annotations:  <none>
Status:       Pending
IP:           
IPs:          <none>
Containers:
  test-pod:
    Container ID:  
    Image:         busybox:stable
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
    Args:
      -c
      touch /mnt/SUCCESS && exit 0 || exit 1
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /mnt from nfs-pvc (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-pjb8t (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  nfs-pvc:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  test-claim
    ReadOnly:   false
  kube-api-access-pjb8t:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason       Age                  From     Message
  ----     ------       ----                 ----     -------
  Warning  FailedMount  56s (x2 over 3m58s)  kubelet  MountVolume.SetUp failed for volume "pvc-ce28631b-dbec-4748-8d44-ee938fb7cb70" : mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t nfs 10.3.243.101:/ifs/kubernetes/default-test-claim-pvc-ce28631b-dbec-4748-8d44-ee938fb7cb70 /var/snap/microk8s/common/var/lib/kubelet/pods/26fdd39b-eef1-4abb-b547-68b0bbfdb88f/volumes/kubernetes.io~nfs/pvc-ce28631b-dbec-4748-8d44-ee938fb7cb70
Output: mount.nfs: Connection timed out
  Warning  FailedMount  26s (x3 over 4m56s)  kubelet  Unable to attach or mount volumes: unmounted volumes=[nfs-pvc], unattached volumes=[nfs-pvc kube-api-access-pjb8t]: timed out waiting for the condition

As is obvious, the attempted mount fails. When I first saw this I read that, when using the provisioner, the pod does not connect to the NFS server directly. That, however, makes the problem more opaque: where would the pod get a wrong mount path from? My NFS server seems to play no role in that.
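
For what it's worth, the mount arguments in the event above should come from the PersistentVolume object the provisioner created; kubelet on the node then performs the actual NFS mount. A quick way to confirm exactly what is being mounted (just a sketch; the PV name is taken from the provisioner log above):

# Show the full PV spec the provisioner wrote:
kubectl get pv pvc-ce28631b-dbec-4748-8d44-ee938fb7cb70 -o yaml
# Or print only the NFS server and export path kubelet will use:
kubectl get pv pvc-ce28631b-dbec-4748-8d44-ee938fb7cb70 \
  -o jsonpath='{.spec.nfs.server}:{.spec.nfs.path}{"\n"}'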

BTW, when I delete the test pod and the claim, the directory gets renamed to "archive-*" as per the documentation.

I'm a little stuck as to what to look into or try next. I haven't seen anyone else post an issue with this problem. I know everyone is busy but any help or pointers are much appreciated.
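
A debugging sketch that may narrow this down, assuming shell access on the node carbon and the NFS client utilities (e.g. nfs-common) installed there: try the same mount by hand. If it also times out, the problem is node-to-server NFS connectivity (firewall, protocol version, export settings) rather than anything the provisioner wrote into the PV.

# Run on the node (carbon); server, export and subdirectory are taken from
# the FailedMount event above.
sudo mkdir -p /mnt/nfs-test
sudo mount -t nfs 10.3.243.101:/ifs/kubernetes/default-test-claim-pvc-ce28631b-dbec-4748-8d44-ee938fb7cb70 /mnt/nfs-test

# If that hangs, try forcing NFSv4, which some servers require:
sudo mount -t nfs -o nfsvers=4 10.3.243.101:/ifs/kubernetes /mnt/nfs-test

# showmount queries the NFSv3/rpcbind side and will not work against a
# v4-only server, which is itself a useful hint:
showmount -e 10.3.243.101

sudo umount /mnt/nfs-test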

AlanObject · Apr 25 '22 03:04

If your NFS server is only accessible via the v4 protocol, then make sure you have that set in the mountOptions for the storage class:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-client
provisioner: k8s-sigs.io/nfs-subdir-external-provisioner
parameters:
  archiveOnDelete: "false"
mountOptions:
  - nfsvers=4
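
If I understand the provisioner correctly, the mountOptions from the StorageClass are copied into each PV at provision time, so an already-provisioned claim would need to be deleted and re-created for the option to take effect. A quick check (sketch) that the newly provisioned PV actually carries the option:

# List PVs together with the mount options they were provisioned with:
kubectl get pv -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.mountOptions}{"\n"}{end}'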

scmmmh · May 11 '22 09:05

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot · Aug 09 '22 09:08

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot · Sep 08 '22 10:09

I have built an alternative solution that does not involve NFS. Although the issue is not resolved, there seems to be little point in leaving this thread open.

One other note: NFS is slow anyway.

AlanObject · Oct 07 '22 22:10