
Kernel: Upheld scheduler lock resulting in a kernel panic

Open konradekk opened this issue 2 years ago • 3 comments

I've encountered the following kernel panic; here is the console dump:

706.453 WebContent(75:75): ResourceLoader: Starting load of: "https://github.githubassets.com/assets/sessions-239675566f74.js"
707.098 WebContent(75:75): Parse error! [handle_in_body @ ./Userland/Libraries/LibWeb/HTML/Parser/HTMLParser.cpp:2076]
707.098 WebContent(75:75): Parse error! [handle_in_body @ ./Userland/Libraries/LibWeb/HTML/Parser/HTMLParser.cpp:1718]
707.470 WebContent(75:75): ResourceLoader: Starting load of: "https://avatars.githubusercontent.com/u/51659396?s=64&v=4"
707.753 WebContent(75:75): ResourceLoader: Starting load of: "https://avatars.githubusercontent.com/u/51659396?v=4"
707.850 WebContent(75:75): ResourceLoader: Starting load of: "https://github.githubassets.com/images/modules/profile/achievements/pair-extraordinaire-default.png"
707.884 WebContent(75:75): ResourceLoader: Starting load of: "https://github.githubassets.com/images/modules/profile/achievements/pull-shark-default.png"
707.923 WebContent(75:75): ResourceLoader: Starting load of: "https://github.githubassets.com/images/modules/profile/achievements/arctic-code-vault-contributor-default.png"
708.034 WebContent(75:75): ResourceLoader: Starting load of: "https://avatars.githubusercontent.com/u/50811782?s=64&v=4"
711.241 WebContent(75:75): Parse error! [handle_in_body @ ./Userland/Libraries/LibWeb/HTML/Parser/HTMLParser.cpp:2076]
711.241 WebContent(75:75): Parse error! [handle_in_body @ ./Userland/Libraries/LibWeb/HTML/Parser/HTMLParser.cpp:1718]
[SystemServer(24:24)]: ASSERTION FAILED: !g_scheduler_lock.is_locked_by_current_processor()
[SystemServer(24:24)]: ./Kernel/Thread.cpp:186 in Kernel::Thread::block_impl(const BlockTimeout&, Blocker&)::<lambda()>
[SystemServer(24:24)]: KERNEL PANIC! :^(
[SystemServer(24:24)]: Aborted
[SystemServer(24:24)]: at ./Kernel/Arch/x86/common/CPU.cpp:35 in void abort()
[SystemServer(24:24)]: Kernel + 0x0000000000bff4ff  Kernel::__panic(char const*, unsigned int, char const*) +0x16f
[SystemServer(24:24)]: Kernel + 0x0000000001147661  abort.localalias +0x38a
[SystemServer(24:24)]: Kernel + 0x00000000011472d7  abort.localalias +0x0
[SystemServer(24:24)]: Kernel + 0x0000000000ffb3fe  AK::Function<void ()>::CallableWrapper<Kernel::Thread::block_impl(Kernel::Thread::BlockTimeout const&, Kernel::Thread::Blocker&)::{lambda()#2}>::call() +0x4de
[SystemServer(24:24)]: Kernel + 0x0000000001086365  AK::Function<void ()>::operator()() const +0x7a5
[SystemServer(24:24)]: Kernel + 0x0000000001083a31  AK::Function<void ()>::CallableWrapper<Kernel::TimerQueue::fire()::{lambda(Kernel::TimerQueue::Queue&)#1}::operator()(Kernel::TimerQueue::Queue&) const::{lambda()#1}>::call() +0x2b1
[SystemServer(24:24)]: Kernel + 0x0000000001187b00  Kernel::Processor::deferred_call_execute_pending() [clone .localalias] +0x2d0
[SystemServer(24:24)]: Kernel + 0x0000000001194e71  Kernel::Processor::exit_trap(Kernel::TrapFrame&) +0x151
[SystemServer(24:24)]: Kernel + 0x00000000011a2620  exit_trap +0x90
[SystemServer(24:24)]: Kernel + 0x0000000001144166  common_trap_exit +0x8
[SystemServer(24:24)]: Kernel + 0x00000000009c7dd8  Kernel::Memory::Region::handle_fault(Kernel::PageFault const&) +0xf78
[SystemServer(24:24)]: Kernel + 0x000000000116bb96  page_fault_handler +0x636
[SystemServer(24:24)]: Kernel + 0x0000000001169616  page_fault_asm_entry +0x36
[SystemServer(24:24)]: Kernel + 0x0000000000d0cbfc  copy_to_user(void*, void const*, unsigned long) +0x13c
[SystemServer(24:24)]: Kernel + 0x0000000000fe5cad  Kernel::Thread::dispatch_signal(unsigned char) [clone .localalias] +0x134d
[SystemServer(24:24)]: Kernel + 0x0000000000cffc2e  Kernel::Scheduler::context_switch(Kernel::Thread*) [clone .localalias] +0x2fe
[SystemServer(24:24)]: Kernel + 0x0000000000d003b0  Kernel::Scheduler::pick_next() [clone .localalias] +0x160
[SystemServer(24:24)]: Kernel + 0x0000000001195c74  Kernel::Processor::clear_critical() +0x114
[SystemServer(24:24)]: Kernel + 0x0000000000fd6798  Kernel::Thread::yield_without_releasing_big_lock(Kernel::Thread::VerifyLockNotHeld) [clone .localalias] +0x2f8
[SystemServer(24:24)]: Kernel + 0x0000000000ffea28  Kernel::Thread::block_impl(Kernel::Thread::BlockTimeout const&, Kernel::Thread::Blocker&) [clone .localalias] +0x7d8
[SystemServer(24:24)]: Kernel + 0x0000000000e5cf40  Kernel::Process::sys$poll(AK::Userspace<Kernel::Syscall::SC_poll_params const*>) +0x1d60
[SystemServer(24:24)]: Kernel + 0x0000000000d15d63  syscall_handler +0x1283
[SystemServer(24:24)]: Kernel + 0x00000000011a27a1  syscall_entry +0x51

Snapshot of the display: kernel-panic

Not sure what I did exactly, but roughly: I opened a README.md (and waited for it to open completely), clicked a link in it, and closed the text editor (it could also have been the other way around). Then, I think, the page had not even finished loading before the panic happened.

Hopefully the above is helpful in some way…! 😯

konradekk avatar Oct 15 '22 21:10 konradekk

I see my name in the browser tab on the taskbar there, what a coincidence :smile:

I've seen this kernel panic happen randomly, and from what I've observed on my pull requests' CI runs it also accounts for a big part of the CI flakes. The recursive Spinlock we support is a giant can of worms and bugs, but I'm still not sure how difficult it would be to remove that feature. This definitely could be a sign of holding a recursive Spinlock and then acquiring a Mutex, which is illegal because preemption cannot be done in a sane manner. When booting in SMP mode, this exact kernel panic happens much more often (which also leads me to the conclusion that it's one of the last SMP stability issues we still have?).
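
To illustrate the hazard described above, here is a minimal userspace sketch. It is not SerenityOS code: `RecursiveSpinlock`, `BlockResult` and `try_block` are hypothetical names made up for this example. The toy lock lets its owner re-acquire it, and a stand-in blocking path refuses to proceed while the current thread still holds the lock, which loosely mirrors the `!g_scheduler_lock.is_locked_by_current_processor()` assertion in the dump.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Toy recursive spinlock (hypothetical, for illustration only): the owning
// thread may re-acquire it, and it is released only once every lock() has
// been matched by an unlock().
class RecursiveSpinlock {
public:
    void lock()
    {
        auto const self = std::this_thread::get_id();
        if (m_owner.load(std::memory_order_acquire) == self) {
            ++m_depth; // nested acquisition by the current owner
            return;
        }
        std::thread::id expected {};
        while (!m_owner.compare_exchange_weak(expected, self,
            std::memory_order_acquire, std::memory_order_relaxed))
            expected = std::thread::id {}; // reset and spin until unowned
        m_depth = 1;
    }

    void unlock()
    {
        assert(is_locked_by_current_thread());
        if (--m_depth == 0)
            m_owner.store(std::thread::id {}, std::memory_order_release);
    }

    bool is_locked_by_current_thread() const
    {
        return m_owner.load(std::memory_order_acquire) == std::this_thread::get_id();
    }

private:
    std::atomic<std::thread::id> m_owner {};
    unsigned m_depth { 0 }; // only touched by the current owner
};

enum class BlockResult {
    Blocked,
    RefusedLockHeld,
};

// Stand-in for a blocking path (for example, waiting on a mutex). Blocking
// while a spinlock is held is the illegal combination, so this sketch
// reports it instead of panicking.
BlockResult try_block(RecursiveSpinlock const& scheduler_lock)
{
    if (scheduler_lock.is_locked_by_current_thread())
        return BlockResult::RefusedLockHeld;
    return BlockResult::Blocked; // a real kernel would yield to the scheduler here
}
```

The point of the sketch is that recursion hides the problem: after `lock(); lock(); unlock();` the inner caller believes it has released the lock, yet the outer acquisition is still in effect, so `try_block` returns `BlockResult::RefusedLockHeld` until the final `unlock()`.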

supercomputer7 avatar Oct 16 '22 16:10 supercomputer7

Well… taskbar stuff was pretty random! 😉

Shall I close the issue, or do you want to use it to track the progress? 🤔 (Or close it yourself when applicable, given the necessary permissions! 🙇🏻‍♂️)

konradekk avatar Oct 16 '22 20:10 konradekk

This is a bug we need to fix, so I will not close this until we have a proper fix in place :)

supercomputer7 avatar Oct 17 '22 12:10 supercomputer7