
nvmet configfs not cleaned up after blktests nvme/063 failed

Open yizhanglinux opened this issue 2 months ago • 6 comments

Here is the log:

# nvme_trtype=tcp ./check nvme/063
nvme/063 (tr=tcp) (Create authenticated TCP connections with secure concatenation)
    runtime  12.836s  ...
WARNING: Test did not clean up port: 0
WARNING: Test did not clean up subsystem: blktests-subsystem-1
rmdir: failed to remove '/sys/kernel/config/nvmet//subsystems/blktests-subsystem-1': Directory not empty
nvme/063 (tr=tcp) (Create authenticated TCP connections with secure concatenation) [failed]
    runtime  12.836s  ...  12.396s
    --- tests/nvme/063.out	2025-11-01 02:12:55.080172746 +0000
    +++ /root/blktests/results/nodev_tr_tcp/nvme/063.out.bad	2025-11-01 06:22:26.595821890 +0000
    @@ -1,7 +1,3 @@
     Running nvme/063
     Test secure concatenation with SHA256
    -Reset controller
    -disconnected 1 controller(s)
    -Test secure concatenation with SHA384
    -disconnected 1 controller(s)
    -Test complete
    ...
    (Run 'diff -u tests/nvme/063.out /root/blktests/results/nodev_tr_tcp/nvme/063.out.bad' to see the entire diff)
WARNING: Test did not clean up subsystem: blktests-subsystem-1
rmdir: failed to remove '/sys/kernel/config/nvmet//subsystems/blktests-subsystem-1': Directory not empty
WARNING: Test did not clean up host: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
rmdir: failed to remove '/sys/kernel/config/nvmet//hosts/nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349': Device or resource busy
# nvmetcli ls
o- / ......................................................................................................................... [...]
  o- hosts ................................................................................................................... [...]
  | o- nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 .................................................. [...]
  o- ports ................................................................................................................... [...]
  o- subsystems .............................................................................................................. [...]
    o- blktests-subsystem-1 ................................................ [version=2.1, allow_any=0, serial=6fb3461f55817e7f8194]
      o- allowed_hosts ....................................................................................................... [...]
      | o- nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 .............................................. [...]
      o- namespaces .......................................................................................................... [...]

yizhanglinux avatar Nov 01 '25 06:11 yizhanglinux

We can use below patch to reproduce the issue:

diff --git a/tests/nvme/063 b/tests/nvme/063
index 5bfe8be..42447c8 100755
--- a/tests/nvme/063
+++ b/tests/nvme/063
@@ -51,11 +51,11 @@ test() {
        _nvme_connect_subsys --dhchap-secret "${hostkey}" --concat

        ctrl=$(_find_nvme_dev "${def_subsysnqn}")
-       if [[ -z "$ctrl" ]]; then
+#      if [[ -z "$ctrl" ]]; then
                echo "WARNING: connection failed"
                _systemctl_stop
                return 1
-       fi
+#      fi
        tlskey=$(_nvme_ctrl_tls_key "$ctrl" || true)
        if [[ -z "$tlskey" ]]; then
                echo "WARNING: connection is not encrypted"

yizhanglinux avatar Nov 02 '25 13:11 yizhanglinux

I tried the change [1], and the issue seems fixed now [2].

[1]

diff --git a/common/nvme b/common/nvme
index 3d43790..e0657a2 100644
--- a/common/nvme
+++ b/common/nvme
@@ -196,6 +196,9 @@ _cleanup_nvmet() {
                for ns in "${subsys}"/namespaces/*; do
                        rmdir "${ns}"
                done
+               for allowed_host in "${subsys}"/allowed_hosts/*; do
+                       rm -f "${allowed_host}"
+               done
                rmdir "${subsys}"
        done

[2]

# nvme_trtype=tcp ./check nvme/063
nvme/063 (tr=tcp) (Create authenticated TCP connections with secure concatenation)
    runtime  2.220s  ...
WARNING: Test did not clean up port: 0
WARNING: Test did not clean up host: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
nvme/063 (tr=tcp) (Create authenticated TCP connections with secure concatenation) [failed]
    runtime  2.220s  ...  2.217s
    --- tests/nvme/063.out	2025-11-02 02:47:38.826227836 -0500
    +++ /root/blktests/results/nodev_tr_tcp/nvme/063.out.bad	2025-11-02 08:43:09.743150576 -0500
    @@ -1,7 +1,3 @@
     Running nvme/063
     Test secure concatenation with SHA256
    -Reset controller
    -disconnected 1 controller(s)
    -Test secure concatenation with SHA384
    -disconnected 1 controller(s)
    -Test complete
    ...
    (Run 'diff -u tests/nvme/063.out /root/blktests/results/nodev_tr_tcp/nvme/063.out.bad' to see the entire diff)

yizhanglinux avatar Nov 02 '25 13:11 yizhanglinux

Another failure, this time with nvme/061 over RDMA:

# use_rxe=1   nvme_trtype=rdma ./check nvme/058 nvme/061
nvme/058 (tr=rdma) (test rapid namespace remapping)          [passed]
    runtime  4.617s  ...  4.569s
nvme/061 (tr=rdma) (test fabric target teardown and setup during I/O)
    runtime  16.890s  ...
WARNING: Test did not clean up port: 0
WARNING: Test did not clean up subsystem: blktests-subsystem-1
rmdir: failed to remove '/sys/kernel/config/nvmet//subsystems/blktests-subsystem-1': Directory not empty
WARNING: Test did not clean up host: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
nvme/061 (tr=rdma) (test fabric target teardown and setup during I/O) [failed]
    runtime  16.890s  ...  10.998s
    --- tests/nvme/061.out	2025-11-03 02:39:37.518511450 -0500
    +++ /root/blktests/results/nodev_tr_rdma/nvme/061.out.bad	2025-11-03 03:05:49.329593397 -0500
    @@ -4,18 +4,9 @@
     state: live
     iteration 1
     state: connecting
    -state: live
    -iteration 2
    -state: connecting
    -state: live
    ...
    (Run 'diff -u tests/nvme/061.out /root/blktests/results/nodev_tr_rdma/nvme/061.out.bad' to see the entire diff)
WARNING: Test did not clean up subsystem: blktests-subsystem-1
rmdir: failed to remove '/sys/kernel/config/nvmet//subsystems/blktests-subsystem-1': Directory not empty
WARNING: Test did not clean up host: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
rmdir: failed to remove '/sys/kernel/config/nvmet//hosts/nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349': Device or resource busy
# nvmetcli ls
o- / .......................................................................................................... [...]
  o- hosts .................................................................................................... [...]
  | o- nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 ................................... [...]
  o- ports .................................................................................................... [...]
  o- subsystems ............................................................................................... [...]
    o- blktests-subsystem-1 ................................. [version=2.1, allow_any=0, serial=f3e8fd2c6a5d1ad81d8e]
      o- allowed_hosts ........................................................................................ [...]
      | o- nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 ............................... [...]
      o- namespaces ........................................................................................... [...]

yizhanglinux avatar Nov 03 '25 08:11 yizhanglinux

The patch in https://github.com/linux-blktests/blktests/issues/207#issuecomment-3477976668 seems reasonable to me, but I think it should also be addressed in the kernel (configfs.c). When the subsys is removed and nvmet_port_subsys_drop_link is called, it should first do the work that nvmet_allowed_host_drop_link does.
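
For reference, the userspace ordering that the blktests patch relies on can be sketched as below. This is only an illustration, not the actual `_cleanup_nvmet` code: `drop_allowed_hosts` is a hypothetical helper name, and the demo runs against a scratch directory that mimics the nvmet layout rather than the real `/sys/kernel/config/nvmet`. The point is that the `allowed_hosts` symlinks have to go first, otherwise the subsystem directory is "not empty" and the host entry stays "busy".

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of blktests): unlink every allowed_hosts
# symlink under one subsystem directory so that the subsystem and the
# host entries can be removed afterwards.
drop_allowed_hosts() {
    local subsys="$1" link
    for link in "${subsys}"/allowed_hosts/*; do
        # An unmatched glob stays literal; the -L test filters that out.
        [[ -L "${link}" ]] || continue
        rm -f -- "${link}"
    done
}

# Demo on a scratch directory that mimics the nvmet configfs layout.
root=$(mktemp -d)
host="nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349"
subsys="${root}/subsystems/blktests-subsystem-1"
mkdir -p "${root}/hosts/${host}" "${subsys}/allowed_hosts"
ln -s "${root}/hosts/${host}" "${subsys}/allowed_hosts/${host}"

drop_allowed_hosts "${subsys}"
remaining=$(find "${subsys}/allowed_hosts" -mindepth 1 | wc -l)
echo "allowed_hosts links left: ${remaining}"
```

On a real target the same order applies: unlink the `allowed_hosts` entries, then `rmdir` the subsystem, then `rmdir` the host. Quoting the path and checking `-L` keeps the loop harmless when `allowed_hosts` is already empty.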

igaw avatar Nov 03 '25 12:11 igaw

Yeah, that looks reasonable. Could you create the patch on the configfs side?

yizhanglinux avatar Nov 05 '25 02:11 yizhanglinux

I'll look into it, though it has low priority for me. So if anyone wants to work on something reasonably simple in the kernel, let me know :)

igaw avatar Nov 25 '25 09:11 igaw