dynamic-nfs-provisioner
Backend NFS Deployment/Service/PVC/PV are removed before kubelet cleanly unmounts the volume
Describe the bug: Sometimes, when a Pod that mounts an NFS PV is deleted at the same time as its NFS PVC/PV, the Pod/PVC/PV and the backend NFS Deployment/Service/PVC/PV are all cleaned up so quickly that the kubelet on the worker node where the Pod was running cannot unmount the NFS volume in time. The leftover NFS mount on the worker node becomes stale and is never unmounted unless done manually, and any IO against it blocks forever (the mount uses the hard NFS option, see the mount output below) until the node is rebooted.
It is also odd that the Pod object is successfully removed from the cluster even though the kubelet has not finished cleaning up the mount on the node.
Expected behaviour: The NFS volume mounted on the worker node is cleaned up.
Steps to reproduce the bug:
- Define a PVC and a Pod that use the NFS PV:
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: openebs-nfs
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 2Gi
  storageClassName: network-file # This is the SC name related to the openebs-nfs-provisioner
---
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: sleep
  name: sleep
spec:
  containers:
  - image: nginx
    name: sleep
    resources: {}
    volumeMounts:
    - name: openebs-nfs
      mountPath: /mnt
  dnsPolicy: ClusterFirst
  terminationGracePeriodSeconds: 0 # intentionally set this to 0
  restartPolicy: Always
  volumes:
  - name: openebs-nfs
    persistentVolumeClaim:
      claimName: openebs-nfs
status: {}
Set terminationGracePeriodSeconds to 0 so the Pod is removed quickly when it is deleted.
- Deploy the manifests above and wait until everything is up, including the backend NFS Deployment/Service/PVC/PV (one way to wait is sketched after the output below):
kubectl -n kube-system get all | grep nfs-pvc
pod/nfs-pvc-9226622c-10b0-4b1d-8d4d-5661c6fec8e3-7cfc9fdc76-x6746 1/1 Running 0 97s
service/nfs-pvc-9226622c-10b0-4b1d-8d4d-5661c6fec8e3 ClusterIP 10.105.148.166 <none> 2049/TCP,111/TCP 96s
deployment.apps/nfs-pvc-9226622c-10b0-4b1d-8d4d-5661c6fec8e3 1/1 1 1 97s
replicaset.apps/nfs-pvc-9226622c-10b0-4b1d-8d4d-5661c6fec8e3-7cfc9fdc76 1 1 1 97s
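A minimal way to apply the manifests and wait, assuming the manifest is saved as pod.yml (as in the delete step below) and the provisioner runs in kube-system; the backend Deployment name is taken from the output above and will differ for every PVC:
# apply the PVC and the Pod
kubectl apply -f pod.yml
# wait until the application Pod is Ready
kubectl wait --for=condition=Ready pod/sleep --timeout=120s
# wait until the backend NFS server Deployment created by the provisioner is rolled out
kubectl -n kube-system rollout status deployment/nfs-pvc-9226622c-10b0-4b1d-8d4d-5661c6fec8e3 --timeout=120s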
- Use kubectl get po -o wide to get the node where the Pod is running:
kubectl get po -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
sleep 1/1 Running 0 2m11s 192.168.171.133 node-10-158-36-65 <none> <none>
- Delete the above resources at the same time, e.g. via kubectl delete -f <path_file_of_above_content>:
kubectl delete -f pod.yml
persistentvolumeclaim "openebs-nfs" deleted
pod "sleep" deleted
- From kubectl's point of view, everything is removed successfully (a sequenced deletion that avoids the race is sketched after the output below):
kubectl get po
No resources found in default namespace.
kubectl -n kube-system get all | grep nfs-pvc
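A workaround that appears to avoid the race is to delete the Pod first, wait for it to terminate (so the kubelet unmounts the volume), and only then delete the PVC; a minimal sketch using the names from the manifest above:
# delete only the Pod and block until it is fully gone
kubectl delete pod sleep --wait=true
# then delete the PVC, which triggers removal of the backend NFS Deployment/Service/PVC/PV
kubectl delete pvc openebs-nfs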
- Go to the node where the Pod ran and run df -h, which gets stuck. mount then shows the leftover NFS volume (manual cleanup is sketched after the output below):
# ssh to the node
mount | grep nfs
10.105.148.166:/ on /var/lib/kubelet/pods/947b2765-78f0-4908-8856-5fe09269999e/volumes/kubernetes.io~nfs/pvc-9226622c-10b0-4b1d-8d4d-5661c6fec8e3 type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.158.36.65,local_lock=none,addr=10.105.148.166)
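The stale mount currently has to be cleaned up by hand on the node; a sketch using the mount path from the output above (a force unmount is intended for NFS mounts whose server is unreachable, a lazy unmount detaches the path immediately and defers the cleanup):
# on the affected node, try a force unmount of the stale NFS mount first
sudo umount -f /var/lib/kubelet/pods/947b2765-78f0-4908-8856-5fe09269999e/volumes/kubernetes.io~nfs/pvc-9226622c-10b0-4b1d-8d4d-5661c6fec8e3
# if it is still busy, fall back to a lazy unmount
sudo umount -l /var/lib/kubelet/pods/947b2765-78f0-4908-8856-5fe09269999e/volumes/kubernetes.io~nfs/pvc-9226622c-10b0-4b1d-8d4d-5661c6fec8e3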
The output of the following commands will help us better understand what's going on:
- kubectl get pods -n <openebs_namespace> --show-labels
- kubectl get pvc -n <openebs_namespace>
- kubectl get pvc -n <application_namespace>
Environment details:
- OpenEBS version (use kubectl get po -n openebs --show-labels): v0.9.0
- Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"5e58841cce77d4bc13713ad2b91fa0d961e69192", GitTreeState:"clean", BuildDate:"2021-05-12T14:18:45Z", GoVersion:"go1.16.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.4", GitCommit:"49499222b0eb0349359881bea01d8d5bd78bf444", GitTreeState:"clean", BuildDate:"2021-12-14T12:41:40Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}
- Cloud provider or hardware configuration:
- OS (e.g: cat /etc/os-release):
NAME="SLES"
VERSION="15-SP3"
VERSION_ID="15.3"
PRETTY_NAME="SUSE Linux Enterprise Server 15 SP3"
ID="sles"
ID_LIKE="suse"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:15:sp3"
DOCUMENTATION_URL="https://documentation.suse.com/"
- kernel (e.g: uname -a):
Linux node-10-158-36-65 5.3.18-57-default #1 SMP Wed Apr 28 10:54:41 UTC 2021 (ba3c2e9) x86_64 x86_64 x86_64 GNU/Linux
- others: The backend storage is Ceph CSI RBD.
StorageClass:
k get sc network-file -o yaml
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    cas.openebs.io/config: |
      - name: NFSServerType
        value: kernel
      - name: BackendStorageClass
        value: network-block
      - name: LeaseTime
        value: 30
      - name: GraceTime
        value: 30
    kubectl.kubernetes.io/last-applied-configuration: |
      {"allowVolumeExpansion":true,"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{"cas.openebs.io/config":"- name: NFSServerType\n value: kernel\n- name: BackendStorageClass\n value: network-block\n- name: LeaseTime\n value: 30\n- name: GraceTime\n value: 30\n","openebs.io/cas-type":"nfsrwx"},"labels":{"addonmanager.kubernetes.io/mode":"Reconcile"},"name":"network-file"},"provisioner":"openebs.io/nfsrwx","reclaimPolicy":"Delete","volumeBindingMode":"Immediate"}
    openebs.io/cas-type: nfsrwx
    storageclass.kubernetes.io/is-default-class: "true"
  creationTimestamp: "2022-05-16T21:12:31Z"
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
  name: network-file
  resourceVersion: "3104"
  selfLink: /apis/storage.k8s.io/v1/storageclasses/network-file
  uid: 1a02778d-391f-4e70-a9f1-cd3c7ad230da
provisioner: openebs.io/nfsrwx
reclaimPolicy: Delete
volumeBindingMode: Immediate
Got the same issue with iSCSI backend storage. k8s attempts the unmount only once, and when it times out it just forgets about it. k8s version is 1.21. @jiuchen1986 did you solve this problem?
This sounds more like a problem or an inconvenience in the k8s behaviour. I'm not sure whether k8s does a lazy unmount when a pod goes away, or whether there is a way to specify that. Taking a look...
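For what it's worth, the forever-blocking IO comes from the hard option visible in the mount output above; an NFS mount with soft plus a bounded timeo/retrans returns IO errors instead of hanging once the server is gone. Below is a minimal sketch of what such options look like on a hand-written PV; whether the dynamic provisioner offers a way to set mount options on the PVs it creates is an open assumption, not something confirmed here:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-nfs-pv        # hypothetical name, for illustration only
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteMany
  mountOptions:               # standard PV field honoured by the in-tree NFS driver
    - soft                    # fail IO with an error instead of retrying forever
    - timeo=30                # NFS request timeout, in tenths of a second
    - retrans=3               # retries before the error is reported
  nfs:
    server: 10.105.148.166    # the backend NFS Service IP from the output above
    path: /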