NFSv3 mount hangs in 3510.2.1; NFSv4.1 works fine

Open kthommandra opened this issue 1 year ago • 6 comments

Description

sudo mount -t nfs -o vers=3 NFSSERVER:/SHARE /tmp/mountdir

The above command hangs. Dumping /proc/PID/stack for the hung process shows the following:

[<0>] lockd_up+0x1f/0x2b0 [lockd]
[<0>] nlmclnt_init+0x28/0xc0 [lockd]
[<0>] nfs_start_lockd+0xdd/0x120 [nfs]
[<0>] nfs_init_server.isra.0+0x1f3/0x350 [nfs]
[<0>] nfs_create_server+0x7f/0x220 [nfs]
[<0>] nfs3_create_server+0xc/0x60 [nfsv3]
[<0>] nfs_try_get_tree+0x12c/0x210 [nfs]
[<0>] vfs_get_tree+0x22/0xc0
[<0>] path_mount+0x469/0xa40
[<0>] __x64_sys_mount+0x107/0x140
[<0>] do_syscall_64+0x38/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x61/0xcb
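
For anyone who wants to capture a comparable trace, here is a minimal sketch of how such a dump could be collected. The helper name `dump_mount_stacks` and its optional proc-root argument are hypothetical. It assumes the stuck process shows up under the command name `mount.nfs` and that it is run as root, since /proc/PID/stack is normally readable only by root.

```shell
#!/bin/sh
# Print the kernel stack of every process whose command name equals $1
# (default: mount.nfs). $2 is the proc root (default: /proc).
# Hypothetical helper; not part of Flatcar or nfs-utils.
dump_mount_stacks() {
    name="${1:-mount.nfs}"
    proc_root="${2:-/proc}"
    for dir in "$proc_root"/[0-9]*; do
        # Skip entries that vanished or that we cannot read.
        [ -r "$dir/comm" ] || continue
        if [ "$(cat "$dir/comm")" = "$name" ]; then
            echo "== PID ${dir##*/} ($name) =="
            cat "$dir/stack"
        fi
    done
}

# Usage (as root, from a second shell while the mount is hanging):
#   dump_mount_stacks mount.nfs
```

If the process name differs on a given node, the first argument can be changed accordingly.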

rpc-statd.service is running on the node

If I re-run the above command with -o nolock then the mount succeeds.

If I use NFSv4 (different NFS server and share) then the mount succeeds.

I tested the same NFSv3 mount on one of the older nodes in our cluster, running 2983.2.1, and the v3 mount works fine there.

Impact

This bug prevents us from mounting NFSv3 shares. One of our NFS shares is required to use NFSv3 only.

Environment and steps to reproduce

  1. sudo mkdir /tmp/nfsv3
  2. sudo mount -t nfs -o vers=3 nfs-server:/nfs-share /tmp/nfsv3

The command stalls, but it can be interrupted with Ctrl-C.

Expected behavior

The mount using NFSv3 should succeed.

Additional information

kthommandra, May 15 '24 17:05

Could someone look into this?

kthommandra, May 21 '24 02:05

bump

kthommandra, Jun 11 '24 09:06

@kthommandra hello, I can't reproduce the issue via a Kubernetes deployment. I tried with two different kernel versions:

$ uname -r
5.15.154-flatcar
$ kubectl exec -ti test-pod-1 -- mount | grep nfs
10.109.64.37:/export/pvc-61c8ad64-c700-4bdb-b996-586ef8bf59f0 on /test type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.109.64.37,mountvers=3,mountport=20048,mountproto=udp,local_lock=none,addr=10.109.64.37)
$ uname -r
6.6.30-flatcar
$ kubectl exec -ti pods/test-pod-1 -- mount | grep nfs
10.101.210.186:/export/pvc-10854062-5dc8-4dfe-99c3-62dedcf956ee on /test type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.101.210.186,mountvers=3,mountport=20048,mountproto=udp,local_lock=none,addr=10.101.210.186)

Did you try with newer Flatcar versions?
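
To compare output like the above between a working and a failing node, a small sketch that pulls a single option out of a `mount` line may help. The function name `nfs_mount_opt` is hypothetical. Note that both lines above report `local_lock=none`; as far as nfs(5) describes it, that means locks are not handled locally, whereas a mount made with `nolock` would be expected to show local locking instead. Treat that reading as an assumption to verify on the affected node.

```shell
#!/bin/sh
# Extract the value of one option from a line of `mount` output.
# $1: option name (e.g. vers), $2: the mount line.
# Prints nothing if the option is absent or has no "=value" part.
# Hypothetical helper for illustration only.
nfs_mount_opt() {
    opts=${2##*\(}      # drop everything up to the last "("
    opts=${opts%%\)*}   # drop everything from the first ")" onward
    # One option per line, then keep only the value of the requested one.
    printf '%s\n' "$opts" | tr ',' '\n' | sed -n "s/^$1=//p"
}

# Usage inside the pod or on the node:
#   nfs_mount_opt vers "$(mount | grep ' type nfs ' | head -n 1)"
#   nfs_mount_opt local_lock "$(mount | grep ' type nfs ' | head -n 1)"
```

Because the match is anchored at the start of each option, asking for `vers` does not pick up `mountvers`.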

tormath1, Jun 11 '24 11:06

Hello @kthommandra,

In addition to @tormath1, I have also tried to reproduce the behaviour using:

  • an Ubuntu 22.04 machine as the NFS host, with NFS installed following the steps in this guide: https://www.digitalocean.com/community/tutorials/how-to-set-up-an-nfs-mount-on-ubuntu-22-04
  • a Flatcar box as client

There were no hangs in my testing.

@kthommandra, can you please elaborate on how you installed the NFS server, or which NFS hardware box you are using? Maybe there are some logs on the NFS box that would help us debug this.

Thanks.

ader1990, Jun 11 '24 11:06

I have not tried newer Flatcar versions yet. Will do. The kernel version in 3510.2.1 is 5.15.106-flatcar. Can you try with this version?

kthommandra, Jun 11 '24 12:06

We are using an NFS appliance, not a Linux NFS server. We are working with the vendor to collect logs.
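
While the vendor logs are being collected, one check that could be run from the client is whether the appliance registers the RPC services that NFSv3 locking normally relies on. The sketch below is only an illustration under assumptions: the helper name `check_nlm_services` is hypothetical, it assumes `rpcinfo` is available on the node, and it assumes the service names `nlockmgr` and `status` appear in the fifth column of `rpcinfo -p` output. It does not by itself explain the hang in `lockd_up`.

```shell
#!/bin/sh
# Read `rpcinfo -p` output on stdin and report whether the lock manager
# (nlockmgr) and status monitor (status) services are registered.
# Returns 0 if both are present, 1 otherwise. Hypothetical helper.
check_nlm_services() {
    listing=$(cat)
    missing=0
    for svc in nlockmgr status; do
        # The service name is the fifth column of `rpcinfo -p` output.
        if printf '%s\n' "$listing" |
            awk -v s="$svc" '$5 == s { found = 1 } END { exit !found }'; then
            echo "$svc: registered"
        else
            echo "$svc: MISSING"
            missing=1
        fi
    done
    return "$missing"
}

# Usage (NFSSERVER is a placeholder, as in the original report):
#   rpcinfo -p NFSSERVER | check_nlm_services
```

The same function can be fed the output of `rpcinfo -p localhost` to see what the client node itself has registered, for example to confirm that rpc-statd shows up as `status`.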

kthommandra, Jun 11 '24 12:06

Hello,

I am going to close this issue:

  • we have not been able to reproduce the issue (through the test suite and manually)
  • those Flatcar and kernel versions (3510.2.1 / 5.15.106-flatcar) are very outdated, and LTS 2023 reaches end of life in July

@kthommandra if you are still facing this issue with a newer version, feel free to open a new issue with logs. Thanks!

tormath1, Jun 27 '25 09:06