atop kernel soft lockup triggered by atop

Hello,

we've got a bunch of servers running RHEL 8.4 EUS regularly having the atop process in a soft lockup state. Being a soft lockup, this has nothing to do with atop itself, but maybe this could be of interest for someone.

Here is what we've got in the system logs:

[1773773.348593] watchdog: BUG: soft lockup - CPU#11 stuck for 23s! [atop:3440032]
[1773773.365333] Modules linked in: dm_mod netlink_diag overlay tcp_diag inet_diag 8021q garp stp mrp llc sch_ingress wekafsio(OE) wekafsgw(OE) uio_pci_generic(OE) igb_uio(OE) uio squashfs nfsv3 nfs_acl nfs lockd grace fscache nft_objref intel_rapl_msr intel_rapl_common nft_counter amd64_edac_mod edac_mce_amd amd_energy nf_log_ipv4 nf_log_common kvm_amd nft_reject_ipv4 nf_reject_ipv4 nft_reject kvm irqbypass nft_log rdma_ucm(OE) crct10dif_pclmul rdma_cm(OE) iw_cm(OE) crc32_pclmul ghash_clmulni_intel rapl pcspkr nf_tables_set nf_tables nfnetlink ib_ipoib(OE) ib_cm(OE) ib_umad(OE) ipmi_ssif drm_vram_helper drm_ttm_helper i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops sp5100_tco acpi_ipmi drm joydev ccp ipmi_si k10temp i2c_piix4 ipmi_devintf ipmi_msghandler acpi_cpufreq xfs libcrc32c loop mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) crc32c_intel mlx5_core(OE) psample mlxdevm(OE) auxiliary(OE) nvme mlx_compat(OE) ahci nvme_core tls libahci ixgbe mlxfw(OE) libata t10_pi
[1773773.365371]  pci_hyperv_intf dca mdio pinctrl_amd sunrpc
[1773773.551648] CPU: 11 PID: 3440032 Comm: atop Tainted: G           OEL   --------- -  - 4.18.0-305.49.1.el8_4.x86_64 #1
[1773773.573212] Hardware name: Supermicro Super Server/H12DST-B, BIOS 2.4 04/19/2022
[1773773.591425] RIP: 0010:proc_cpuset_show+0xae/0x150
[1773773.606717] Code: ef 45 31 e4 e8 13 ab 19 00 be 0a 00 00 00 4c 89 ef e8 f6 a5 19 00 48 89 ef e8 1e b5 13 00 5b 44 89 e0 5d 41 5c 41 5d c3 f3 90 <49> 8b 84 24 08 0d 00 00 48 8b 18 f6 43 7c 01 75 85 48 8b 43 18 a8
[1773773.646574] RSP: 0018:ffffb59a6a073e10 EFLAGS: [00000202](https://access.redhat.com/support/cases/#/case/00000202) ORIG_RAX: ffffffffffffff13
[1773773.664576] RAX: 000049f42bc02d7b RBX: ffff8b974db08400 RCX: 0000000000000001
[1773773.682040] RDX: 00000000006000c0 RSI: 0000000000000000 RDI: ffff8b2787c06840
[1773773.699383] RBP: ffff8b338f2dd000 R08: ffffb59a6a073dc8 R09: 0000000000000000
[1773773.716638] R10: ffff8b338f2db000 R11: 0000000000000000 R12: ffff8b6b0ac25f00
[1773773.733770] R13: ffff8b658bb06900 R14: ffffffffbb0583a0 R15: ffff8b658bb06900
[1773773.750730] FS:  00007f7d7c521740(0000) GS:ffff8b65cdac0000(0000) knlGS:0000000000000000
[1773773.768661] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1773773.784071] CR2: 000014da5c02b028 CR3: 00000021e8af6000 CR4: 0000000000350ee0
[1773773.800922] Call Trace:
[1773773.812971]  proc_single_show+0x4e/0x90
[1773773.826389]  seq_read+0x163/0x420
[1773773.839249]  vfs_read+0x91/0x140
[1773773.852063]  ksys_read+0x4f/0xb0
[1773773.864670]  do_syscall_64+0x5b/0x1a0
[1773773.877583]  entry_SYSCALL_64_after_hwframe+0x65/0xca
[1773773.891880] RIP: 0033:0x7f7d7b4215a5
[1773773.904664] Code: fe ff ff 50 48 8d 3d f2 f6 09 00 e8 f5 fd 01 00 0f 1f 44 00 00 f3 0f 1e fa 48 8d 05 f5 6f 2d 00 8b 00 85 c0 75 0f 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 53 c3 66 90 41 54 49 89 d4 55 48 89 f5 53 89
[1773773.942441] RSP: 002b:00007ffcb29155a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[1773773.959826] RAX: ffffffffffffffda RBX: 00000000009a72a0 RCX: 00007f7d7b4215a5
[1773773.976872] RDX: 0000000000000400 RSI: 00000000009a7490 RDI: 00000000000000c7
[1773773.993963] RBP: 0000000000000d68 R08: 0000000000000001 R09: 0000000000000000
[1773774.010994] R10: 00007f7d7c521740 R11: 0000000000000246 R12: 00007f7d7b6ef880
[1773774.028006] R13: 00007f7d7b6f03c0 R14: 00000000000000ff R15: 00000000009a72a0

We've opened a ticket with RedHat with a vmcore and they have identified a potential bug in the kernel. The kernel team is looking at it to see if a backport is possible. I don't know if RHEL 8.6 + is affected.

The fix might reside in this commit: https://github.com/torvalds/linux/commit/18fa84a2db0e15b02baa5d94bdb5bd509175d2f6

At some point, atop is trying to read the virtual /proc/[pid]/cpuset set, which trigger the soft lockup. In the meantime, I think I will recompile atop and remove the call to proccont function in the photoproc function: https://github.com/Atoptool/atop/blob/d9e0e6a703f152e3e17195afb227596d9983ca4e/photoproc.c#L873

I'll update this issue as soon as I have more information.

Jul 26 '22 14:07 jbd

The fix might reside in this commit: https://github.com/torvalds/linux/commit/18fa84a2db0e15b02baa5d94bdb5bd509175d2f6

RHEL support just said that it will need to be backported to RHEL8. The patch has been introduced upstream around kernel 5.2. The internal RedHat bug report is BZ#2111491.

Jul 27 '22 12:07 jbd

This will be fixed in the next RHEL 8.6 kernel in mid-september.

Aug 29 '22 14:08 jbd

atop atop copied to clipboard

kernel soft lockup triggered by atop

atop
atop copied to clipboard