create/destroy namespace yields kernel warning messages
Two processes running concurrently: (1) in a loop, ndctl create; (2) in a loop, whenever a namespace created by (1) is found, ndctl destroy all.
[ 2433.123249] WARNING: CPU: 4 PID: 55873 at
drivers/nvdimm/namespace_devs.c:916 size_store+0x7c2/0x910 [libnvdimm]
[....]
[ 2433.123282] RIP: 0010:size_store+0x7c2/0x910 [libnvdimm]
[ 2433.123290] Call Trace:
[ 2433.123297] dev_attr_store+0x1b/0x25
[ 2433.123302] sysfs_kf_write+0x3f/0x46
[ 2433.123304] kernfs_fop_write+0x124/0x1a3
[ 2433.123309] __vfs_write+0x3a/0x16d
[ 2433.123312] ? __fd_install+0x31/0xce
[ 2433.123314] vfs_write+0xb2/0x1a1
[ 2433.123318] ? syscall_trace_enter+0x1ce/0x2b8
[ 2433.123320] SyS_write+0x55/0xb9
[ 2433.123321] do_syscall_64+0x79/0x1b9
[ 2433.123325] entry_SYSCALL_64_after_hwframe+0x169/0x0
The following script can reproduce the issue.
#!/bin/bash
set -x
# create/destroy namespace yields kernel warning messages
#######################################################
BASEDIR=$(dirname $(realpath $0))
TESTNAME=$(basename $0 .sh)
NDCTL=$(which ndctl)
LOGDIR=$BASEDIR/logdir
rm -rf $LOGDIR
mkdir -p $LOGDIR

NdctlCreate()
{
    local region=$1
    while true; do
        SetSize=$(($RANDOM % 5 + 1))
        cmd_create="$NDCTL create-namespace --region=$region --mode=devdax --size=${SetSize}G"
        eval $cmd_create >> $LOGDIR/${TESTNAME}.create.${region}.log 2>&1
        [ $? -ne 0 ] && echo "$cmd_create - FAIL" | tee -a $LOGDIR/${TESTNAME}.create.${region}.log
    done
}

NdctlDestroy()
{
    while true; do
        NamespaceList=$($NDCTL list -N | jq -r .[].dev | tr -s '\n' ' ')
        Date=$(date '+%H%M%S')
        if [ -n "$NamespaceList" ]; then
            echo "== $Date: destroy - $NamespaceList" | tee -a $LOGDIR/${TESTNAME}.destroy.log
            cmd_destroy="$NDCTL destroy-namespace all -f"
            eval $cmd_destroy >> $LOGDIR/${TESTNAME}.destroy.log 2>&1
        else
            echo "Empty - no namespace to destroy" | tee -a $LOGDIR/${TESTNAME}.destroy.log
        fi
    done
}

### MAIN
NdctlCreate region1 & PID1=$!
NdctlDestroy & PID2=$!
PanicDef=$(dmesg | grep -c "Call Trace")
while [ $(jobs -p | wc -l) -gt 0 ]; do
    sleep 5
    CntPanic=$(dmesg | grep -c "Call Trace")
    if [ $CntPanic -gt $PanicDef ]; then
        kill -TERM $PID1 $PID2
        dmesg | awk '/cut here/ ,/end trace/' | tee -a $LOGDIR/${TESTNAME}.panic.log
    fi
done
Hi Scott,
Is this with Dan's "libnvdimm: Fix async operations and locking" patchset applied (on libnvdimm-pending)? I ran the script on the nfit_test bus and on a qemu-based ACPI.NFIT bus with a libnvdimm-pending kernel, and couldn't reproduce it.
Ah, letting it run long enough on the qemu ACPI.NFIT bus, I did hit this:
[65958.613893] ------------[ cut here ]------------
[65958.614764] kernel BUG at mm/page_alloc.c:489!
[65958.615450] invalid opcode: 0000 [#1] SMP PTI
[65958.616382] CPU: 4 PID: 7551 Comm: ndctl Tainted: G O 5.2.0-rc4+ #17
[65958.617692] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.1-0-g0551a4be2c-prebuilt.qemu-project.org 04/01/2014
[65958.621558] RIP: 0010:set_pfnblock_flags_mask+0x8d/0xe0
[65958.625447] Code: c1 eb 37 83 e0 03 48 69 c0 40 07 00 00 48 03 04 dd a0 9d 7d 82 48 8b 58 60 48 39 da 73 0e 48 c7 c6 28 ae 33 82 e8 13 b4 fd ff <0f> 0b 48 03 58 70 48 39 da 73 e9 49 01 c9 b9 3f 00 00 00 44 29 c9
[65958.631655] RSP: 0018:ffffc9000124ba40 EFLAGS: 00010246
[65958.632450] RAX: 000000000000004a RBX: 0000000000858000 RCX: 0000000000000000
[65958.633587] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff82339a90
[65958.635005] RBP: ffff8883122d3340 R08: 000059fda5e85e1e R09: 0000000000000000
[65958.636597] R10: 0000000000000000 R11: 0000000000000000 R12: 000000000013ae00
[65958.637816] R13: 0000000000978000 R14: 007fff8000000000 R15: 0000000103e9daef
[65958.639000] FS: 00007f3df24f0600(0000) GS:ffff888318c00000(0000) knlGS:0000000000000000
[65958.640378] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[65958.641366] CR2: 00007f9d4f4f8f60 CR3: 00000001738f6003 CR4: 00000000001606e0
[65958.642758] Call Trace:
[65958.643222] memmap_init_zone_device+0x145/0x17f
[65958.643910] devm_memremap_pages+0x4e7/0x6b0
[65958.644731] dev_dax_probe+0xef/0x240 [device_dax]
[65958.645653] really_probe+0xef/0x390
[65958.646212] ? driver_allows_async_probing+0x50/0x50
[65958.646969] driver_probe_device+0xb4/0x100
[65958.647622] ? driver_allows_async_probing+0x50/0x50
[65958.648380] bus_for_each_drv+0x78/0xc0
[65958.648977] __device_attach+0xd9/0x160
[65958.649584] bus_probe_device+0x9c/0xb0
[65958.650171] device_add+0x410/0x670
[65958.650824] __devm_create_dev_dax+0x110/0x190
[65958.651443] __dax_pmem_probe+0x1ef/0x210 [dax_pmem_core]
[65958.652201] dax_pmem_probe+0xc/0x20 [dax_pmem]
[65958.652832] nvdimm_bus_probe+0x66/0x160 [libnvdimm]
[65958.653520] really_probe+0xef/0x390
[65958.654023] driver_probe_device+0xb4/0x100
[65958.654605] device_driver_attach+0x4f/0x60
[65958.655187] bind_store+0xc9/0x110
[65958.655654] kernfs_fop_write+0xf0/0x1a0
[65958.656193] vfs_write+0xba/0x1c0
[65958.656663] ksys_write+0x5f/0xe0
[65958.657114] do_syscall_64+0x60/0x240
[65958.657611] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[65958.658291] RIP: 0033:0x7f3df27fdfa8
[65958.658896] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 75 77 0d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
[65958.661400] RSP: 002b:00007ffd0e047cf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[65958.662430] RAX: ffffffffffffffda RBX: 000000000189d040 RCX: 00007f3df27fdfa8
[65958.663400] RDX: 0000000000000007 RSI: 000000000189d040 RDI: 0000000000000004
[65958.664355] RBP: 0000000000000007 R08: 0000000000000007 R09: 0000000000000006
[65958.665300] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
[65958.667176] R13: 00007f3df24f0528 R14: 0000000000000001 R15: 000000000189d040
[65958.668136] Modules linked in: kmem xfs nfit_test(O) dax_pmem_compat(O) device_dax(O) nfit(O) nd_blk(O) dax_pmem(O) nd_pmem(O) dax_pmem_core(O) nd_btt(O) nd_e820(O) libnvdimm(O) nfit_test_iomap(O) [last unloaded: nfit_test_iomap]
[65958.670883] ---[ end trace ecaeee5608fb71ff ]---
And to answer your previous question: yes, this panic has been seen with the "async operations" patchset in libnvdimm-pending.
@swdavenport are you still seeing the warning with the v5.3-rc3 kernel? The "kernel BUG at mm/page_alloc.c" issue seems to be unrelated, but nonetheless concerning. Just checking if there are still 2 issues to debug, or just the one.
@djbw The "kernel BUG at mm/page_alloc.c" issue was seen when running the above script in two separate terminals simultaneously. You're correct, that was a different issue which I can confirm is fixed. With that, I'm seeing 400+ seg faults after running them for 2+hours. The backtraces are the same, here is a sample, perhaps ndctl could use some checking?
(gdb) bt
#0 0x00007f224658a730 in ndctl_pfn_get_bus () from /lib64/libndctl.so.6
#1 0x00007f224658a749 in ndctl_pfn_get_ctx () from /lib64/libndctl.so.6
#2 0x00007f224658ab87 in ndctl_pfn_set_uuid () from /lib64/libndctl.so.6
#3 0x000000000040b188 in setup_namespace ()
#4 0x000000000040bf24 in do_xaction_namespace ()
#5 0x000000000040c338 in cmd_create_namespace ()
#6 0x000000000041b10b in main_handle_internal_command ()
#7 0x0000000000408788 in main ()
(gdb) p $_siginfo.si_signo
$1 = 11
(gdb) p $_siginfo.si_code
$3 = 1
(gdb) p $_siginfo._sifields._sigfault.si_addr
$4 = (void *) 0x8
(gdb) info proc
exe = '/usr/bin/ndctl create-namespace --region=region1 --mode=devdax --size=3G'
(gdb) info r
rdx 0x16695e8 23500264
rsi 0x7fff11ed20e0 140733494141152
rdi 0x0 0
rip 0x7f224658a730 0x7f224658a730 <ndctl_pfn_get_bus>
eflags 0x10206 [ PF IF RF ]
(gdb) disass ndctl_pfn_get_bus,+0x20
Dump of assembler code from 0x7f224658a730 to 0x7f224658a750:
=> 0x00007f224658a730 <ndctl_pfn_get_bus+0>: mov 0x8(%rdi),%rax
0x00007f224658a734 <ndctl_pfn_get_bus+4>: mov 0x8(%rax),%rax
0x00007f224658a738 <ndctl_pfn_get_bus+8>: retq
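To make the fault concrete: %rdi is 0 and si_addr is 0x8, so the first load in ndctl_pfn_get_bus() ("mov 0x8(%rdi),%rax") is dereferencing a NULL struct ndctl_pfn pointer, presumably handed down from setup_namespace() after the destroy loop removed the seed it was about to configure. The C sketch below is illustrative only, not the actual libndctl source; the struct layouts and field names are assumptions chosen so the pointers sit at offset 0x8 as in the disassembly, and the second function shows the kind of NULL guard that would turn the crash into a clean failure.

#include <stddef.h>

struct ndctl_bus;

struct ndctl_region {
        void *ctx;                   /* assumed: some pointer at offset 0x0 */
        struct ndctl_bus *bus;       /* assumed: bus pointer at offset 0x8 */
};

struct ndctl_pfn {
        void *ctx;                   /* assumed: some pointer at offset 0x0 */
        struct ndctl_region *region; /* assumed: region pointer at offset 0x8 */
};

/*
 * Unchecked access, matching the faulting instructions:
 *   mov 0x8(%rdi),%rax   ; pfn->region -- faults at address 0x8 when pfn == NULL
 *   mov 0x8(%rax),%rax   ; region->bus
 */
struct ndctl_bus *pfn_get_bus_unchecked(struct ndctl_pfn *pfn)
{
        return pfn->region->bus;
}

/*
 * The kind of guard that would turn the segfault into a clean error, whether
 * placed here or in the caller that obtained a NULL pfn/dax seed while the
 * destroy loop was tearing namespaces down.
 */
struct ndctl_bus *pfn_get_bus_checked(struct ndctl_pfn *pfn)
{
        if (!pfn || !pfn->region)
                return NULL;
        return pfn->region->bus;
}

int main(void)
{
        /* With a NULL pfn the checked variant returns NULL instead of
         * faulting at address 0x8 like the unchecked one. */
        return pfn_get_bus_checked(NULL) == NULL ? 0 : 1;
}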
The size_store() issue is still there; it can be reproduced by running the script in a single terminal.
There has been some development in stabilizing the kernel with respect to pmem operations. Please test with a recent kernel.