Crash during OOM with your shrinker changes
Hello! I get an Oops when out_of_memory() is invoked, during its diagnostics. Im around commit 22145e66e491826707666bdd98e0b1ff198bdebe. Bcachefs is not even used on this system! (I just usually pull from your trees instead of Linus's)
[ 844.962567][ T3761] sysrq: Manual OOM execution
[ 844.965403][ T3997] kworker/0:3 invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=-1, oom_score_adj=0
[ 844.965437][ T3997] CPU: 0 UID: 0 PID: 3997 Comm: kworker/0:3 Tainted: G W 6.15.0-rc4-kitty-TEST #8 PREEMPT_RT
[ 844.965446][ T3997] Tainted: [W]=WARN
[ 844.965449][ T3997] Hardware name: Sweet Sunflower Kitty (ACPI)
[ 844.965453][ T3997] Workqueue: events moom_callback
[ 844.965474][ T3997] Call trace:
[ 844.965478][ T3997] show_stack+0x14/0x1c (C)
[ 844.965488][ T3997] dump_stack_lvl+0xbc/0x134
[ 844.965498][ T3997] dump_stack+0x14/0x1c
[ 844.965502][ T3997] dump_header+0x3c/0x260
[ 844.965509][ T3997] oom_kill_process+0x148/0x5e0
[ 844.965514][ T3997] out_of_memory+0xd4/0x650
[ 844.965519][ T3997] moom_callback+0x7c/0xb4
[ 844.965525][ T3997] process_one_work+0x1fc/0x634
[ 844.965533][ T3997] worker_thread+0x1bc/0x360
[ 844.965538][ T3997] kthread+0x134/0x210
[ 844.965543][ T3997] ret_from_fork+0x10/0x20
[ 844.965554][ T3997] Mem-Info:
[ 844.965563][ T3997] active_anon:26 inactive_anon:0 isolated_anon:0
[ 844.965563][ T3997] active_file:45 inactive_file:68 isolated_file:0
[ 844.965563][ T3997] unevictable:127 dirty:0 writeback:0
[ 844.965563][ T3997] slab_reclaimable:420 slab_unreclaimable:1581
[ 844.965563][ T3997] mapped:37 shmem:127 pagetables:10
[ 844.965563][ T3997] sec_pagetables:6 bounce:0
[ 844.965563][ T3997] kernel_misc_reclaimable:0
[ 844.965563][ T3997] free:112098 free_pcp:166 free_cma:8144
[ 844.965578][ T3997] Node 0 active_anon:1664kB inactive_anon:0kB active_file:2880kB inactive_file:4352kB unevictable:8128kB isolated(anon):0kB isolated(file):0kB mapped:2368kB dirty:0kB writeback:0kB shmem:8128kB writeback_tmp:0kB kernel_stack:14272kB pagetables:640kB sec_pagetables:384kB all_unreclaimable? no Balloon:0kB
[ 844.965587][ T3997] Node 0 DMA free:3474048kB boost:0kB min:4992kB low:8448kB high:11904kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3930112kB managed:3481024kB mlocked:0kB bounce:0kB free_pcp:2432kB local_pcp:64kB free_cma:521216kB
[ 844.965601][ T3997] lowmem_reserve[]: 0 0 249 249 249
[ 844.965618][ T3997] Node 0 Normal free:3700224kB boost:0kB min:5888kB low:9920kB high:13952kB reserved_highatomic:0KB active_anon:1664kB inactive_anon:0kB active_file:2880kB inactive_file:4352kB unevictable:8128kB writepending:0kB present:4194304kB managed:4089664kB mlocked:0kB bounce:0kB free_pcp:8192kB local_pcp:2304kB free_cma:0kB
[ 844.965631][ T3997] lowmem_reserve[]: 0 0 0 0 0
[ 844.965646][ T3997] Node 0 DMA: 4*64kB (UM) 3*128kB (UM) 2*256kB (M) 1*512kB (M) 3*1024kB (MC) 2*2048kB (M) 4*4096kB (UMC) 5*8192kB (UMC) 4*16384kB (UMC) 4*32768kB (UMC) 3*65536kB (UMC) 3*131072kB (UMC) 2*262144kB (MC) 4*524288kB (M) = 3474048kB
[ 844.966246][ T3997] Node 0 Normal: 20*64kB (UME) 14*128kB (UM) 6*256kB (UME) 6*512kB (UME) 2*1024kB (M) 0*2048kB 3*4096kB (UME) 1*8192kB (U) 0*16384kB 2*32768kB (UE) 1*65536kB (E) 3*131072kB (UME) 2*262144kB (ME) 5*524288kB (M) = 3700224kB
[ 844.966336][ T3997] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=16777216kB
[ 844.966342][ T3997] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=524288kB
[ 844.966346][ T3997] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 844.966350][ T3997] 240 total pagecache pages
[ 844.966354][ T3997] 0 pages in swap cache
[ 844.966356][ T3997] Free swap = 0kB
[ 844.966359][ T3997] Total swap = 0kB
[ 844.966363][ T3997] 126944 pages RAM
[ 844.966366][ T3997] 0 pages HighMem/MovableOnly
[ 844.966368][ T3997] 8652 pages reserved
[ 844.966371][ T3997] 8192 pages cma reserved
[ 844.966373][ T3997] 0 pages hwpoisoned
[ 844.966409][ T3997] Unreclaimable slab info:
[ 844.968914][ T3997] kmalloc-128k total: 28.5 MiB active: 28.5 MiB
[ 844.968920][ T3997] kernfs_node_cache total: 9.06 MiB active: 9.05 MiB
[ 844.968923][ T3997] kmalloc-2k total: 3.88 MiB active: 3.87 MiB
[ 844.968928][ T3997] names_cache total: 2.88 MiB active: 2.88 MiB
[ 844.968931][ T3997] task_struct total: 2.28 MiB active: 2.21 MiB
[ 844.968934][ T3997] kmalloc-512 total: 2.19 MiB active: 2.17 MiB
[ 844.968937][ T3997] kmalloc-32k total: 2.00 MiB active: 2.00 MiB
[ 844.968939][ T3997] kmalloc-64k total: 2.00 MiB active: 2.00 MiB
[ 844.968942][ T3997] kmalloc-cg-64k total: 2.00 MiB active: 2.00 MiB
[ 844.968945][ T3997] shmem_inode_cache total: 1.71 MiB active: 1.71 MiB
[ 844.968948][ T3997]
[ 844.968951][ T3997] Shrinkers:
[ 844.972755][ T3997] Unable to handle kernel NULL pointer dereference at virtual address 00000000000006b0
[ 844.972783][ T3997] Mem abort info:
[ 844.972786][ T3997] ESR = 0x0000000096000005
[ 844.972790][ T3997] EC = 0x25: DABT (current EL), IL = 32 bits
[ 844.972797][ T3997] SET = 0, FnV = 0
[ 844.972801][ T3997] EA = 0, S1PTW = 0
[ 844.972805][ T3997] FSC = 0x05: level 1 translation fault
[ 844.972809][ T3997] Data abort info:
[ 844.972811][ T3997] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
[ 844.972815][ T3997] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 844.972819][ T3997] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 844.972823][ T3997] user pgtable: 64k pages, 48-bit VAs, pgdp=0000000107e4a000
[ 844.972828][ T3997] [00000000000006b0] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
[ 844.973553][ T3997] Internal error: Oops: 0000000096000005 [#1] SMP
[ 844.973562][ T3997] Modules linked in:
[ 844.973572][ T3997] CPU: 0 UID: 0 PID: 3997 Comm: kworker/0:3 Tainted: G W 6.15.0-rc4-kitty-TEST #8 PREEMPT_RT
[ 844.973580][ T3997] Tainted: [W]=WARN
[ 844.973581][ T3997] Hardware name: Sweet Sunflower Kitty (ACPI)
[ 844.973585][ T3997] Workqueue: events moom_callback
[ 844.973602][ T3997] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 844.973607][ T3997] pc : memcg_page_state+0x38/0x108
[ 844.973619][ T3997] lr : zswap_shrinker_count+0xd4/0x1ac
[ 844.973630][ T3997] sp : ffff8000a43cfa80
[ 844.973633][ T3997] x29: ffff8000a43cfa80 x28: 0000000000000000 x27: 0000000000000002
[ 844.973639][ T3997] x26: ffff000105ab9700 x25: ffff800086e45740 x24: 0000000000000cc0
[ 844.973645][ T3997] x23: ffff800086e457d0 x22: 0000000000000000 x21: 0000000000000001
[ 844.973650][ T3997] x20: ffff0001003f5048 x19: 0000000000000000 x18: 0000000000000018
[ 844.973656][ T3997] x17: 0000000000200000 x16: ffff000100011100 x15: ffffffffff08de68
[ 844.973661][ T3997] x14: ffff00010c6a3e00 x13: 0000000000000336 x12: ffff000100030000
[ 844.973667][ T3997] x11: ffff0001ff047380 x10: ffffffffffffffe0 x9 : ffff800098588050
[ 844.973673][ T3997] x8 : ffff8000876666b0 x7 : 0000000000000108 x6 : 0000000000000000
[ 844.973678][ T3997] x5 : ffff800091331500 x4 : 0000000000000001 x3 : 0000000000000025
[ 844.973683][ T3997] x2 : 0000000000000035 x1 : 0000000000000035 x0 : 0000000000000000
[ 844.973689][ T3997] Call trace:
[ 844.973693][ T3997] memcg_page_state+0x38/0x108 (P)
[ 844.973702][ T3997] zswap_shrinker_count+0xd4/0x1ac
[ 844.973710][ T3997] shrinkers_to_text+0x9c/0x2c8
[ 844.973719][ T3997] __show_mem+0x1bc/0x234
[ 844.973729][ T3997] dump_header+0xcc/0x260
[ 844.973736][ T3997] oom_kill_process+0x148/0x5e0
[ 844.973741][ T3997] out_of_memory+0xd4/0x650
[ 844.973746][ T3997] moom_callback+0x7c/0xb4
[ 844.973753][ T3997] process_one_work+0x1fc/0x634
[ 844.973763][ T3997] worker_thread+0x1bc/0x360
[ 844.973768][ T3997] kthread+0x134/0x210
[ 844.973775][ T3997] ret_from_fork+0x10/0x20
[ 844.973789][ T3997] Code: 91066063 3862c863 7103fc7f 54000140 (f9435a73)
[ 844.973799][ T3997] ---[ end trace 0000000000000000 ]---
#
that looks like a zswap bug, you'll want to file it with them
@koverstreet but this crash didnt exist before your shrinker changes,?
and all we're doing is calling the standard callback in the normal way...
will you merge rc5,rc6 to your testing tree @koverstreet ?
Fixed this recently, folded the fix into the shrinker debug patchset
No, not fixed. You never ever pressed sysrq+m or sysrq+f to just test your own changes? Sad.
I did, but it's not even possible for me to test every subsystem with a shrinker and all the different kconfig options, unfortunately.
Yes, it sucks.
Are you sure you're hitting the same bug? Can you give me a log?
On Tue, Jul 29, 2025, 3:59 PM y.mi. @.***> wrote:
thememika left a comment (koverstreet/bcachefs#886) https://github.com/koverstreet/bcachefs/issues/886#issuecomment-3134056926
No, not fixed. You never ever pressed sysrq+m or sysrq+f to just test your own changes? Sad.
— Reply to this email directly, view it on GitHub https://github.com/koverstreet/bcachefs/issues/886#issuecomment-3134056926, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAPGX3T5V23ZOKIFAWLB2NL3K7OD3AVCNFSM6AAAAAB5KKHQUSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZTCMZUGA2TMOJSGY . You are receiving this because you modified the open/close state.Message ID: @.***>
@koverstreet
[ 92.636534][ T179] sysrq: Show Memory
[ 92.636546][ T179] Mem-Info:
[ 92.636551][ T179] active_anon:0 inactive_anon:140 isolated_anon:0
[ 92.636551][ T179] active_file:1186 inactive_file:3939 isolated_file:0
[ 92.636551][ T179] unevictable:4 dirty:1 writeback:0
[ 92.636551][ T179] slab_reclaimable:2132 slab_unreclaimable:37355
[ 92.636551][ T179] mapped:899 shmem:4 pagetables:269
[ 92.636551][ T179] sec_pagetables:0 bounce:0
[ 92.636551][ T179] kernel_misc_reclaimable:0
[ 92.636551][ T179] free:451193 free_pcp:12197 free_cma:0
[ 92.636567][ T179] Node 0 active_anon:0kB inactive_anon:560kB active_file:4744kB inactive_file:15756kB unevictable:16kB isolated(anon):0kB isolated(file):0kB mapped:3596kB dirty:4kB writeback:0kB shmem:16kB shmem_thp:0kB shmem_pmdmapped:0kB anon_thp:0kB writeback_tmp:0kB kernel_stack:9464kB pagetables:1076kB sec_pagetables:0kB all_unreclaimable? no Balloon:0kB
[ 92.636581][ T179] Node 0 DMA free:15356kB boost:0kB min:424kB low:528kB high:632kB reserved_highatomic:0KB free_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 92.636598][ T179] lowmem_reserve[]: 0 2375 2375 2375 2375
[ 92.636610][ T179] Node 0 DMA32 free:1789416kB boost:0kB min:67156kB low:83944kB high:100732kB reserved_highatomic:0KB free_highatomic:0KB active_anon:0kB inactive_anon:560kB active_file:4744kB inactive_file:15756kB unevictable:16kB writepending:4kB present:3092728kB managed:2432320kB mlocked:0kB bounce:0kB free_pcp:48788kB local_pcp:8172kB free_cma:0kB
[ 92.636625][ T179] lowmem_reserve[]: 0 0 0 0 0
[ 92.636635][ T179] Node 0 Normal free:0kB boost:0kB min:0kB low:0kB high:0kB reserved_highatomic:0KB free_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:509952kB managed:0kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 92.636647][ T179] lowmem_reserve[]: 0 0 0 0 0
[ 92.636657][ T179] Node 0 DMA: 1*4kB (U) 1*8kB (U) 1*16kB (U) 1*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 1*512kB (U) 0*1024kB 1*2048kB (M) 3*4096kB (M) = 15356kB
[ 92.636699][ T179] Node 0 DMA32: 580*4kB (UE) 265*8kB (UM) 185*16kB (UM) 362*32kB (UM) 333*64kB (U) 295*128kB (U) 241*256kB (UME) 124*512kB (UME) 25*1024kB (UE) 4*2048kB (ME) 379*4096kB (M) = 1789416kB
[ 92.636742][ T179] Node 0 Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
[ 92.636769][ T179] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 92.636774][ T179] 5148 total pagecache pages
[ 92.636777][ T179] 0 pages in swap cache
[ 92.636779][ T179] Free swap = 0kB
[ 92.636782][ T179] Total swap = 0kB
[ 92.636784][ T179] 904668 pages RAM
[ 92.636787][ T179] 0 pages HighMem/MovableOnly
[ 92.636789][ T179] 292748 pages reserved
[ 92.636791][ T179] 0 pages cma reserved
[ 92.636793][ T179] 0 pages hwpoisoned
[ 92.636823][ T179] Unreclaimable slab info:
[ 92.637161][ T179] kmemleak_object total: 62.7 MiB active: 62.7 MiB
[ 92.637165][ T179] kmalloc-4k total: 17.3 MiB active: 17.2 MiB
[ 92.637242][ T179] kernfs_node_cache total: 10.4 MiB active: 10.4 MiB
[ 92.637246][ T179] kmalloc-2k total: 5.81 MiB active: 5.67 MiB
[ 92.637248][ T179] kmalloc-8k total: 4.38 MiB active: 4.19 MiB
[ 92.637251][ T179] kmalloc-512 total: 2.88 MiB active: 2.84 MiB
[ 92.637253][ T179] Acpi-State total: 2.81 MiB active: 2.78 MiB
[ 92.637256][ T179] kmalloc-1k total: 2.44 MiB active: 2.29 MiB
[ 92.637258][ T179] Acpi-Operand total: 2.38 MiB active: 1.89 MiB
[ 92.637260][ T179] drbd_req total: 2.25 MiB active: 2.25 MiB
[ 92.637262][ T179]
[ 92.637271][ T179] Shrinkers:
[ 92.637335][ T179] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN
[ 92.637341][ T179] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[ 92.637350][ T179] CPU: 0 UID: 0 PID: 179 Comm: busybox Tainted: G S B W 6.16.9-kitty-sob-kurumuru-5-kasan-memcheck #4 PREEMPT_{RT,LAZY}
[ 92.637361][ T179] Tainted: [S]=CPU_OUT_OF_SPEC, [B]=BAD_PAGE, [W]=WARN
[ 92.637364][ T179] Hardware name: ASUSTeK COMPUTER INC. X200CA/X200CA, BIOS X200CA.208 09/18/2013
[ 92.637367][ T179] RIP: 0010:strlen+0x27/0x80
[ 92.637379][ T179] Code: cc cc cc f3 0f 1e fa 41 57 41 56 41 54 53 49 be 00 00 00 00 00 fc ff df 48 89 f9 48 89 fb 0f 1f 44 00 00 48 89 c8 48 c1 e9 03 <42> 0f b6 0c 31 84 c9 75 0f 80 3b 00 48 8d 5b 01 48 8d 48 01 75 e3
[ 92.637384][ T179] RSP: 0018:ffffc900151df9e8 EFLAGS: 00010246
[ 92.637390][ T179] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 92.637395][ T179] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 92.637398][ T179] RBP: 0000000000000000 R08: ffff88801ba932df R09: 1ffff1100375265b
[ 92.637402][ T179] R10: dffffc0000000000 R11: ffffed100375265c R12: ffffc900151dfca0
[ 92.637405][ T179] R13: ffff88803c5b9000 R14: dffffc0000000000 R15: dffffc0000000000
[ 92.637409][ T179] FS: 00007ffff7c9c740(0000) GS:ffff88811df2c000(0000) knlGS:0000000000000000
[ 92.637414][ T179] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 92.637418][ T179] CR2: 00007ffff7ff138c CR3: 0000000033762004 CR4: 00000000001726f0
[ 92.637423][ T179] Call Trace:
[ 92.637427][ T179] <TASK>
[ 92.637433][ T179] seq_buf_puts+0x27/0x180
[ 92.637443][ T179] shrinker_to_text+0xb9/0x300
[ 92.637454][ T179] shrinkers_to_text+0x52d/0x600
[ 92.637464][ T179] ? vprintk_emit+0x806/0x900
[ 92.637478][ T179] __show_mem+0x2f0c/0x2fc0
[ 92.637499][ T179] ? _printk+0x4f/0x80
[ 92.637510][ T179] __handle_sysrq+0x3c8/0x700
[ 92.637518][ T179] ? __handle_sysrq+0xc6/0x700
[ 92.637526][ T179] write_sysrq_trigger+0x85/0xc0
[ 92.637533][ T179] proc_reg_write+0x11a/0x240
[ 92.637543][ T179] vfs_write+0x1c3/0x880
[ 92.637553][ T179] ? rt_mutex_slowunlock.26931+0x3f5/0x7c0
[ 92.637560][ T179] ? lock_release+0x35/0x480
[ 92.637569][ T179] __x64_sys_write+0x10f/0x1c0
[ 92.637577][ T179] ? entry_SYSCALL_64_after_hwframe+0x4b/0x53
[ 92.637583][ T179] do_syscall_64+0x6d/0x300
[ 92.637589][ T179] entry_SYSCALL_64_after_hwframe+0x4b/0x53
[ 92.637594][ T179] RIP: 0033:0x7ffff7dce4b0
[ 92.637601][ T179] Code: 99 0d 00 64 c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 80 3d 79 1d 0e 00 00 74 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 48 89
[ 92.637605][ T179] RSP: 002b:00007fffffffe2d8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 92.637612][ T179] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007ffff7dce4b0
[ 92.637616][ T179] RDX: 0000000000000002 RSI: 000055555567d040 RDI: 0000000000000001
[ 92.637619][ T179] RBP: 000055555567d040 R08: fefefefefefefeff R09: fefefefefefeff6c
[ 92.637623][ T179] R10: 0000000000000008 R11: 0000000000000202 R12: 0000000000000002
[ 92.637626][ T179] R13: 00007ffff7c9c6c8 R14: 0000000000000000 R15: 00007fffffffe340
[ 92.637637][ T179] </TASK>
[ 92.637639][ T179] Modules linked in:
[ 92.637645][ T179] ---[ end trace 0000000000000000 ]---
[ 92.637649][ T179] RIP: 0010:strlen+0x27/0x80
[ 92.637655][ T179] Code: cc cc cc f3 0f 1e fa 41 57 41 56 41 54 53 49 be 00 00 00 00 00 fc ff df 48 89 f9 48 89 fb 0f 1f 44 00 00 48 89 c8 48 c1 e9 03 <42> 0f b6 0c 31 84 c9 75 0f 80 3b 00 48 8d 5b 01 48 8d 48 01 75 e3
[ 92.637659][ T179] RSP: 0018:ffffc900151df9e8 EFLAGS: 00010246
[ 92.637664][ T179] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 92.637667][ T179] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 92.637670][ T179] RBP: 0000000000000000 R08: ffff88801ba932df R09: 1ffff1100375265b
[ 92.637673][ T179] R10: dffffc0000000000 R11: ffffed100375265c R12: ffffc900151dfca0
[ 92.637677][ T179] R13: ffff88803c5b9000 R14: dffffc0000000000 R15: dffffc0000000000
[ 92.637681][ T179] FS: 00007ffff7c9c740(0000) GS:ffff88811df2c000(0000) knlGS:0000000000000000
[ 92.637685][ T179] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 92.637688][ T179] CR2: 00007ffff7ff138c CR3: 0000000033762004 CR4: 00000000001726f0
EDIT1:
seq_buf_puts is in ./include/linux/fortify-string.h:268.
---
if (p_size == SIZE_MAX)
return __underlying_strlen(p);
---
shrinker_to_text is in mm/shrinker.c:837.
---
seq_buf_putc(out, '\n');
---
shrinkers_to_text is in mm/shrinker.c:910.
---
... for (i = nr - 1; i >= 0; --i)
... shrinker_to_text(out, shrinkers_by_mem[i].shrinker);
---
So it's seq_buf pointer causing crash? Seq buf is created in mm/show_mem:495
printk("Shrinkers:\n");
seq_buf_init(&s, buf, buf_size);
It's not the same, but it's obvious that somewhere null pointer is passed to function(s) that in the end dereferences it
In the build where you tested, you had config memcg on?
yes, I most certainly did test with memcg
it appears that we can't rely on shrinker->name being initialized
oh ok, I changed
seq_buf_puts(out, shrinker->name);
seq_buf_putc(out, '\n');
to
if (unlikely(!shrinker->name))
seq_buf_puts(out, "(null)");
else
seq_buf_puts(out, shrinker->name);
seq_buf_putc(out, '\n');
Will test when I can
@koverstreet works!
so that's interesting, I was reading over the code again yesterday and it looks like it shouldn't be possible for shrinker->name to be NULL, maybe we're somehow seeing a partially initialized shrinker, or one that's being torn down?
@koverstreet but we take shrinkers lock
Considering that it always happens (I did lots of attempts), I don't think that it can be a race with removal/add of shrinker
I'll try a torture test using while true; do (echo m > /proc/sysrq-trigger); done in conjunction with the null check, to be more confident that we can rely on it just being null (and it's not a race)
The best way of course is to track where that shrinker was created and how
So I tried a bit of kdb and and printk instrumentation, and all shrinkers at the point of shrinker_register() have name = null.
Here is an example of a very-early-created shrinker.
<6>[ 0.458628][ T0] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
<0>[ 0.459262][ T0] ==== NEW SHRINKER in shrinker_register() !!!
<4>[ 0.459265][ T0] struct shrinker
<4>[ 0.459268][ T0] {
<4>[ 0.459270][ T0] const char * name = "(null)"
<4>[ 0.459274][ T0] unsigned long (*)(struct shrinker *, struct shrink_control *) count_objects = (____ptrval____)
<4>[ 0.459280][ T0] unsigned long (*)(struct shrinker *, struct shrink_control *) scan_objects = (____ptrval____)
<4>[ 0.459283][ T0] void (*)(struct seq_buf *, struct shrinker *) to_text = 0000000000000000
<4>[ 0.459287][ T0] long batch = 0
<4>[ 0.459290][ T0] int seeks = 2
<4>[ 0.459292][ T0] unsigned int flags = 2
<4>[ 0.459295][ T0] refcount_t refcount =
<4>[ 0.459297][ T0] {
<4>[ 0.459299][ T0] atomic_t refs =
<4>[ 0.459302][ T0] {
<4>[ 0.459303][ T0] int counter = 0
<4>[ 0.459306][ T0] }
<4>[ 0.459308][ T0] }
<4>[ 0.459310][ T0] struct completion done =
<4>[ 0.459312][ T0] {
<4>[ 0.459315][ T0] unsigned int done = 0
<4>[ 0.459317][ T0] struct swait_queue_head wait =
<4>[ 0.459319][ T0] {
<4>[ 0.459321][ T0] raw_spinlock_t lock =
<4>[ 0.459323][ T0] {
<4>[ 0.459325][ T0] arch_spinlock_t raw_lock =
<4>[ 0.459328][ T0] {
<4>[ 0.459330][ T0] atomic_t val =
<4>[ 0.459332][ T0] {
<4>[ 0.459334][ T0] int counter = 0
<4>[ 0.459337][ T0] }
<4>[ 0.459339][ T0] u8 locked = 0
<4>[ 0.459342][ T0] u8 pending = 0
<4>[ 0.459344][ T0] u16 locked_pending = 0
<4>[ 0.459347][ T0] u16 tail = 0
<4>[ 0.459350][ T0] }
<4>[ 0.459351][ T0] unsigned int magic = 0
<4>[ 0.459354][ T0] unsigned int owner_cpu = 0
<4>[ 0.459357][ T0] void * owner = 0000000000000000
<4>[ 0.459360][ T0] struct lockdep_map dep_map =
<4>[ 0.459362][ T0] {
<4>[ 0.459364][ T0] struct lock_class_key * key = 0000000000000000
<4>[ 0.459367][ T0] struct lock_class *[2] class_cache = *(____ptrval____)
<4>[ 0.459370][ T0] const char * name = "(null)"
<4>[ 0.459373][ T0] u8 wait_type_outer = 0
<4>[ 0.459376][ T0] u8 wait_type_inner = 0
<4>[ 0.459378][ T0] u8 lock_type = 0
<4>[ 0.459381][ T0] int cpu = 0
<4>[ 0.459384][ T0] unsigned long ip = 0
<4>[ 0.459387][ T0] }
<4>[ 0.459388][ T0] }
<4>[ 0.459390][ T0] struct list_head task_list =
<4>[ 0.459392][ T0] {
<4>[ 0.459394][ T0] struct list_head * next = 0000000000000000
<4>[ 0.459397][ T0] struct list_head * prev = 0000000000000000
<4>[ 0.459400][ T0] }
<4>[ 0.459402][ T0] }
<4>[ 0.459403][ T0] }
<4>[ 0.459405][ T0] struct callback_head rcu =
<4>[ 0.459407][ T0] {
<4>[ 0.459409][ T0] struct callback_head * next = 0000000000000000
<4>[ 0.459412][ T0] void (*)(struct callback_head *) func = 0000000000000000
<4>[ 0.459415][ T0] }
<4>[ 0.459417][ T0] void * private_data = 0000000000000000
<4>[ 0.459419][ T0] struct list_head list =
<4>[ 0.459421][ T0] {
<4>[ 0.459423][ T0] struct list_head * next = 0000000000000000
<4>[ 0.459426][ T0] struct list_head * prev = 0000000000000000
<4>[ 0.459429][ T0] }
<4>[ 0.459431][ T0] int id = 0
<4>[ 0.459433][ T0] atomic_long_t * nr_deferred = (____ptrval____)
<4>[ 0.459436][ T0] atomic_long_t objects_requested_to_free =
<4>[ 0.459438][ T0] {
<4>[ 0.459440][ T0] s64 counter = 0
<4>[ 0.459443][ T0] }
<4>[ 0.459445][ T0] atomic_long_t objects_freed =
<4>[ 0.459446][ T0] {
<4>[ 0.459448][ T0] s64 counter = 0
<4>[ 0.459451][ T0] }
<4>[ 0.459452][ T0] unsigned long last_freed = 0
<4>[ 0.459455][ T0] unsigned long last_scanned = 0
<4>[ 0.459458][ T0] atomic64_t ns_run =
<4>[ 0.459459][ T0] {
<4>[ 0.459461][ T0] s64 counter = 0
<4>[ 0.459464][ T0] }
<4>[ 0.459465][ T0] }
<4>[ 0.459506][ T0]
<4>[ 0.459517][ T0] CPU: 0 UID: 0 PID: 0 Comm: swapper Tainted: G S 6.16.9-kitty-sob-kurumuru-6-kasan-memcheck #11 PREEMPT_{RT,LAZY}
<4>[ 0.459533][ T0] RIP: 0010:shrinker_register+0xd94/0x1100
<4>[ 0.459583][ T0] Call Trace:
<4>[ 0.459588][ T0] <TASK>
<4>[ 0.459611][ T0] vmalloc_init+0xb86/0xbc0
<4>[ 0.459640][ T0] mm_core_init+0x5b/0x80
<4>[ 0.459647][ T0] start_kernel+0x198/0x440
<4>[ 0.459656][ T0] x86_64_start_reservations+0x24/0x40
<4>[ 0.459664][ T0] x86_64_start_kernel+0x145/0x200
<4>[ 0.459675][ T0] common_startup_64+0x13e/0x167
<4>[ 0.459708][ T0] </TASK>
Why? I found. shrinker_register() is called here from vmalloc_init():
// mm/vmalloc.c:5296
shrinker_register(vmap_node_shrinker);
Prior to that, vmalloc_init allocates the shrinker using:
vmap_node_shrinker = shrinker_alloc(0, "vmap-node");
shrinker_alloc has prototype struct shrinker *shrinker_alloc(unsigned int flags, const char *fmt, ...), which makes it obvious that it takes a formatted name, which it will set to the shrinker (it should, right?).
Okay, lets see it inside. It uses these va args only one time. To call shrinker_debugfs_name_alloc(shrinker, fmt, ap);
In that function, we see this picture:
// mm/interal.h
#ifdef CONFIG_SHRINKER_DEBUG
static inline __printf(2, 0) int shrinker_debugfs_name_alloc(
struct shrinker *shrinker, const char *fmt, va_list ap)
{
shrinker->name = kvasprintf_const(GFP_KERNEL, fmt, ap);
return shrinker->name ? 0 : -ENOMEM;
}
...
#else /* CONFIG_SHRINKER_DEBUG */
...
static inline int shrinker_debugfs_name_alloc(struct shrinker *shrinker,
const char *fmt, va_list ap)
{
return 0;
}
...
#endif /* CONFIG_SHRINKER_DEBUG */
Looking at my config:
# CONFIG_SHRINKER_DEBUG is not set
Yes. So it absolutely can be null and you should fix your code. Not everyone has this 'shrinker debug' enabled.
and, finally fixed