
BUG sleeping function called from invalid context, not fixed

Open thememika opened this issue 6 months ago • 10 comments

Following up on https://github.com/koverstreet/bcachefs/issues/893: even with your change from a spinlock to a raw spinlock in rcu_pending.c, it's still not fixed on RT. I'm happy to keep working on this.

[  111.566326] =============================
[  111.566327] [ BUG: Invalid wait context ]
[  111.566333] 6.15.0-kitty-TEST3 #26 Tainted: G        W
[  111.566339] -----------------------------
[  111.566341] mount/4161 is trying to lock:
[  111.566346] ffff0001ff50b8f0 ((&c->lock)){+.+.}-{3:3}, at: ___slab_alloc+0x78/0xd5c
[  111.566372] other info that might help us debug this:
[  111.566374] context-{5:5}
[  111.566376] 3 locks held by mount/4161:
[  111.566378]  #0: ffff000119504880 (&c->btree_trans_barrier){.+.+}-{0:0}, at: bch2_btree_path_traverse_one+0x89c/0x12a4
[  111.566392]  #1: ffff00010fe08140 (bcachefs_btree){+.+.}-{0:0}, at: bch2_trans_begin+0x138/0x5dc
[  111.566403]  #2: ffffffdfbf43ec20 (&p->lock#2){....}-{2:2}, at: rcu_pending_enqueue+0x68/0x504
[  111.566418] stack backtrace:
[  111.566423] CPU: 2 UID: 0 PID: 4161 Comm: mount Tainted: G        W           6.15.0-kitty-TEST3 #26 PREEMPT_RT
[  111.566430] Tainted: [W]=WARN
[  111.566432] Hardware name: Sweet Sunflower Kitty (DT)
[  111.566436] Call trace:
[  111.566439]  show_stack+0x14/0x1c (C)
[  111.566447]  dump_stack_lvl+0xbc/0x134
[  111.566457]  dump_stack+0x14/0x1c
[  111.566462]  __lock_acquire+0x7e4/0x19d8
[  111.566470]  lock_acquire+0x104/0x310
[  111.566475]  rt_spin_lock+0x40/0x188
[  111.566482]  ___slab_alloc+0x78/0xd5c
[  111.566489]  __kmalloc_cache_noprof+0x94/0x31c
[  111.566495]  __genradix_ptr_alloc+0x228/0x580
[  111.566504]  rcu_pending_enqueue+0x354/0x504
[  111.566511]  bch2_btree_key_cache_drop+0xe0/0x450
[  111.566517]  bch2_trans_commit_write_locked.isra.0+0x1374/0x1830
[  111.566525]  __bch2_trans_commit+0x8ac/0x114c
[  111.566530]  bch2_journal_replay+0x37c/0x954
[  111.566537]  __bch2_run_recovery_passes+0x224/0x7c8
[  111.566543]  bch2_run_recovery_passes+0x150/0x19c
[  111.566549]  bch2_fs_recovery+0xe14/0x1fac
[  111.566555]  bch2_fs_start+0x350/0xab0
[  111.566563]  bch2_fs_get_tree+0x788/0x9b4
[  111.566571]  vfs_get_tree+0x24/0xf0
[  111.566578]  path_mount+0x400/0xaf4
[  111.566586]  __arm64_sys_mount+0x120/0x298
[  111.566591]  do_el0_svc+0x68/0x12c
[  111.566599]  el0_svc+0x3c/0x11c
[  111.566607]  el0t_64_sync_handler+0x104/0x130
[  111.566612]  el0t_64_sync+0x168/0x16c
[  111.566621] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
[  111.566625] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 4161, name: mount
[  111.566628] preempt_count: 1, expected: 0
[  111.566630] RCU nest depth: 0, expected: 0
[  111.566631] INFO: lockdep is turned off.
[  111.566633] irq event stamp: 150286
[  111.566635] hardirqs last  enabled at (150285): [<ffff800083503a0c>] _raw_spin_unlock_irqrestore+0x80/0x84
[  111.566643] hardirqs last disabled at (150286): [<ffff800081033088>] rcu_pending_enqueue+0x22c/0x504
[  111.566650] softirqs last  enabled at (93366): [<ffff8000800ed1b0>] __local_bh_enable_ip+0x144/0x244
[  111.566656] softirqs last disabled at (93354): [<ffff80008280852c>] dw_mci_request+0x34/0x110
[  111.566667] CPU: 2 UID: 0 PID: 4161 Comm: mount Tainted: G        W 6.15.0-kitty-TEST3 #26 PREEMPT_RT
[  111.566672] Tainted: [W]=WARN
[  111.566674] Hardware name: Sweet Sunflower Kitty (DT)
[  111.566676] Call trace:
[  111.566677]  show_stack+0x14/0x1c (C)
[  111.566682]  dump_stack_lvl+0xbc/0x134
[  111.566690]  dump_stack+0x14/0x1c
[  111.566695]  __might_resched+0x13c/0x22c
[  111.566703]  rt_spin_lock+0x58/0x188
[  111.566710]  ___slab_alloc+0x78/0xd5c
[  111.566716]  __kmalloc_cache_noprof+0x94/0x31c
[  111.566722]  __genradix_ptr_alloc+0x228/0x580
[  111.566729]  rcu_pending_enqueue+0x354/0x504
[  111.566736]  bch2_btree_key_cache_drop+0xe0/0x450
[  111.566742]  bch2_trans_commit_write_locked.isra.0+0x1374/0x1830
[  111.566749]  __bch2_trans_commit+0x8ac/0x114c
[  111.566755]  bch2_journal_replay+0x37c/0x954
[  111.566761]  __bch2_run_recovery_passes+0x224/0x7c8
[  111.566767]  bch2_run_recovery_passes+0x150/0x19c
[  111.566773]  bch2_fs_recovery+0xe14/0x1fac
[  111.566778]  bch2_fs_start+0x350/0xab0
[  111.566785]  bch2_fs_get_tree+0x788/0x9b4
[  111.566792]  vfs_get_tree+0x24/0xf0
[  111.566799]  path_mount+0x400/0xaf4
[  111.566805]  __arm64_sys_mount+0x120/0x298
[  111.566810]  do_el0_svc+0x68/0x12c
[  111.566817]  el0_svc+0x3c/0x11c
[  111.566824]  el0t_64_sync_handler+0x104/0x130
[  111.566829]  el0t_64_sync+0x168/0x16c

thememika avatar May 29 '25 21:05 thememika

The problem is here

(screenshot of the code in question)

thememika avatar May 29 '25 21:05 thememika

When you allocate, interrupts are disabled.

thememika avatar May 29 '25 21:05 thememika

(The preallocated branch wasn't taken)

thememika avatar May 29 '25 21:05 thememika

I will try GFP_NOWAIT instead of GFP_KERNEL, but tomorrow morning.

thememika avatar May 29 '25 21:05 thememika

There is some problem where the may_sleep variable is incorrectly determined in a specific case. GFP_NOWAIT could be a temporary workaround, I guess (or not, if there wasn't enough memory in the first place). Alternatively, just don't take the if (may_sleep) branch when IRQs are off?

thememika avatar May 29 '25 22:05 thememika

[ 2682.023813] in_atomic(): 1, irqs_disabled(): 1 — but why?

thememika avatar May 29 '25 22:05 thememika

yeah I had to yank the fix because it turned out to have a different issue, I'll finish it soon

koverstreet avatar May 30 '25 00:05 koverstreet

@koverstreet even with GFP_ATOMIC you can't allocate while holding a raw_spin_lock:

 *
 * %GFP_ATOMIC users can not sleep and need the allocation to succeed. A lower
 * watermark is applied to allow access to "atomic reserves".
 * The current implementation doesn't support NMI and few other strict
 * non-preemptive contexts (e.g. raw_spin_lock). The same applies to %GFP_NOWAIT.
 *

thememika avatar May 30 '25 08:05 thememika

@koverstreet

thememika avatar Jun 04 '25 13:06 thememika

@thememika I've been busy with higher-priority bugs; one was a repair path that was taking out filesystems, and I've got more work to do on the repair code over the next week so other people can get their filesystems back.

Chill out and be patient. A bug that only affects PREEMPT_RT is not as high priority as bugs that have people's filesystems offline, but I will get to it.

koverstreet avatar Jun 04 '25 23:06 koverstreet

This is now fixed in 6.16-rc2.

koverstreet avatar Jun 17 '25 17:06 koverstreet