ndctl

crash on create/destroy namespace

Open • Sean58238 opened this issue 4 years ago • 1 comment

1. Issue: The system can occasionally crash while stress-testing ndctl namespace creation and destruction.

Udev rule:

KERNEL=="dax*",SUBSYSTEMS=="nd", ATTRS{uuid}=="?*", SYMLINK+="disk/by-id/dax-$attr{uuid}"

Backtrace:

crash> bt
PID: 94249 TASK: ffff88c45f708000 CPU: 1 COMMAND: "systemd-udevd"
#0 [ffffc90021ed3490] machine_kexec at ffffffff8106002b
#1 [ffffc90021ed34e8] __crash_kexec at ffffffff81139ab2
#2 [ffffc90021ed35b8] crash_kexec at ffffffff8113adbc
#3 [ffffc90021ed35d8] oops_end at ffffffff81024fd1
#4 [ffffc90021ed3600] no_context at ffffffff8106d2f2
#5 [ffffc90021ed3670] __bad_area_nosemaphore at ffffffff8106d680
#6 [ffffc90021ed36c8] bad_area_nosemaphore at ffffffff8106d826
#7 [ffffc90021ed36d8] do_kern_addr_fault at ffffffff8106d92a
#8 [ffffc90021ed3700] __do_page_fault at ffffffff8106dcb7
#9 [ffffc90021ed3768] do_page_fault at ffffffff8106ddf0
#10 [ffffc90021ed37a0] page_fault at ffffffff81c010a4
    [exception RIP: memcpy_erms+6]
    RIP: ffffffff81b4f866 RSP: ffffc90021ed3858 RFLAGS: 00010246
    RAX: ffff8884a1aa7000 RBX: 0000000000001000 RCX: 0000000000001000
    RDX: 0000000000001000 RSI: ffffc90080200000 RDI: ffff8884a1aa7000
    RBP: ffffc90021ed38f8 R8: ffff88c45f708000 R9: 0000000000001000
    R10: 0000000000000000 R11: 000000001286a9c0 R12: 0000000000000000
    R13: 0000000000001000 R14: 0000000000001000 R15: ffffc90080200000
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#11 [ffffc90021ed3858] pmem_do_bvec at ffffffff815fbf59
#12 [ffffc90021ed3900] pmem_rw_page at ffffffff815fc24b
#13 [ffffc90021ed3928] bdev_read_page at ffffffff812defa6
#14 [ffffc90021ed3958] do_mpage_readpage at ffffffff812e73f4
#15 [ffffc90021ed3a20] mpage_readpages at ffffffff812e76b9
#16 [ffffc90021ed3af8] blkdev_readpages at ffffffff812df51d
#17 [ffffc90021ed3b08] read_pages at ffffffff811f6fb7
#18 [ffffc90021ed3b88] __do_page_cache_readahead at ffffffff811f72bf
#19 [ffffc90021ed3c28] force_page_cache_readahead at ffffffff811f7730
#20 [ffffc90021ed3c60] page_cache_sync_readahead at ffffffff811f7826
#21 [ffffc90021ed3c98] generic_file_buffered_read at ffffffff811eb9f4
#22 [ffffc90021ed3da0] generic_file_read_iter at ffffffff811ede20
#23 [ffffc90021ed3de8] blkdev_read_iter at ffffffff812df6d7
#24 [ffffc90021ed3df8] new_sync_read at ffffffff8129662a
#25 [ffffc90021ed3e90] __vfs_read at ffffffff81298ef9
#26 [ffffc90021ed3ea0] vfs_read at ffffffff81298f9e
#27 [ffffc90021ed3ed8] ksys_read at ffffffff81299301
#28 [ffffc90021ed3f18] __x64_sys_read at ffffffff8129938a
#29 [ffffc90021ed3f28] do_syscall_64 at ffffffff810029cf
#30 [ffffc90021ed3f50] entry_SYSCALL_64_after_hwframe at ffffffff81c0008c
    RIP: 00007f9ec04566f0 RSP: 00007ffc07704968 RFLAGS: 00000246
    RAX: ffffffffffffffda RBX: 0000000000200000 RCX: 00007f9ec04566f0
    RDX: 0000000000000400 RSI: 000055edf6bf6dd8 RDI: 0000000000000009
    RBP: 000055edf6bf6db0 R8: 0000000005231994 R9: 0000000000000428
    R10: 0000000000000040 R11: 0000000000000246 R12: 000055edf6bb2bf0
    R13: 0000000000000400 R14: 000055edf6bb2c40 R15: 000055edf6bf6dc8
    ORIG_RAX: 0000000000000000 CS: 0033 SS: 002b

2. Test script:

#!/bin/bash
k=0
while true
do
    # Create 32 devdax namespaces: two batches of 8 on each of region0 and region1.
    echo "1" >> ndlog
    for ((i=0;i<=7;++i))
    do
        ndctl create-namespace -r region0 -m devdax -s 63G -n empt-0-0-$i
    done
    echo "2" >> ndlog
    for ((i=0;i<=7;++i))
    do
        ndctl create-namespace -r region0 -m devdax -s 63G -n empt-0-1-$i
    done
    echo "3" >> ndlog
    for ((i=0;i<=7;++i))
    do
        ndctl create-namespace -r region1 -m devdax -s 63G -n empt-1-0-$i
    done
    echo "4" >> ndlog
    for ((i=0;i<=7;++i))
    do
        ndctl create-namespace -r region1 -m devdax -s 63G -n empt-1-1-$i
    done
    echo "5" >> ndlog

    # All 32 namespaces must be listed with the expected size.
    line=`ndctl list --namespaces  |grep '"size":66586673152'|wc -l`

    if [ "$line" -ne 32 ];then
        echo "ERROR create" >> ndlog
        exit
    fi
    # The udev rule must have created one by-id symlink per namespace.
    daxline=`ls /dev/disk/by-id/|grep dax|wc -l`
    if [ "$daxline" -ne 32 ];then
        echo "ERROR link" >> ndlog
        exit
    fi
    ((++k))
    date >> ndlog
    echo "done $k" >> ndlog
    # Tear everything down before the next iteration.
    ndctl destroy-namespace all --force
    [[ $? -ne 0 ]] && echo "ERROR destroy" && exit

done

Kernel: 5.4.32
ndctl: v68
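
The two count checks in the loop follow the same pattern, so they could be factored into small helpers. The following is only a sketch: `count_matches` and `expect_count` are hypothetical names that do not appear in the script above, the sketch reports errors on stderr rather than appending to `ndlog`, and it assumes the same one-match-per-line output that the `grep | wc -l` pipelines rely on.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the original test script.

# Count the lines on stdin that contain the given fixed string.
# grep -c prints 0 but returns non-zero when nothing matches, hence "|| true".
count_matches() {
    grep -c -F -- "$1" || true
}

# Compare an actual count with the expected one; report and fail on mismatch.
expect_count() {
    local label=$1 actual=$2 expected=$3
    if [ "$actual" -ne "$expected" ]; then
        echo "ERROR $label: got $actual, expected $expected" >&2
        return 1
    fi
}

# Possible use inside the loop (commented out because it needs real hardware):
#   line=$(ndctl list --namespaces | count_matches '"size":66586673152')
#   expect_count create "$line" 32 || exit 1
#   daxline=$(ls /dev/disk/by-id/ | count_matches dax)
#   expect_count link "$daxline" 32 || exit 1
```

Reporting the actual count alongside the expected one makes it easier to tell a partially failed batch from a complete failure.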

Sean58238, Aug 10 '21 01:08

This is the wrong place to report this. ndctl is not responsible for kernel crashes; the kernel is.

Also, it should be fixed in current kernel versions (such as 5.15).
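
A rough way to see whether a machine runs a kernel at least as new as the 5.15 mentioned here is to compare `uname -r` against it. This is a sketch only: `version_ge` is a hypothetical helper, it relies on GNU `sort -V`, and 5.15 is just the example version from this comment, not a confirmed boundary for the fix (distribution kernels may also carry backports).

```shell
#!/bin/bash
# Hypothetical helper: succeeds when version $1 is greater than or equal to $2.
# Uses "sort -V" (GNU coreutils version sort) to order the two strings.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

running=$(uname -r)
# Strip any distribution suffix such as "-generic" before comparing.
if version_ge "${running%%-*}" 5.15; then
    echo "kernel $running is 5.15 or newer"
else
    echo "kernel $running is older than 5.15"
fi
```

With the reporter's kernel, `version_ge 5.4.32 5.15` fails, so the second branch would be taken.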

hramrach, Jan 14 '22 14:01