Theseus Use preemption-safe locks instead of interrupt-safe locks wherever possible

Theseus uses the MutexIrqSafe type to ensure that interrupts are disabled when accessing the inner data. This is quite useful for sensitive contexts where deadlock could otherwise occur if another execution context is allowed to take over while a regular spinlock Mutex might be held. For example, usage of MutexIrqSafe is required whenever sharing data between a regular execution context and an interrupt handler context.

However, we've used it far too widely in Theseus, primarily because of our initial goal of reducing unnecessary OS state. That choice made sense in pursuit of that goal, because whether preemption is currently enabled is another piece of per-core state that must be atomically accessible by all tasks on a given CPU core.

In many cases it suffices to disable preemption (task switching) so that we don't context switch to another task while a sensitive lock is held. Such contexts are most commonly found in the task management code itself, i.e., in the functions of the task and spawn crates. Switching those locks to preemption-safe ones would improve general I/O performance and system interactivity by keeping interrupts enabled in regular execution contexts that have nothing to do with interrupt handling.
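To make the idea concrete, here is a minimal userspace sketch of a preemption-safe spinlock. Everything in it is an assumption for illustration: the names `PreemptSafeMutex`, `PreemptionGuard`, and `hold_preemption` are hypothetical rather than Theseus's actual API, and a single global counter stands in for the real per-core preemption state.

```rust
use std::cell::UnsafeCell;
use std::ops::{Deref, DerefMut};
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

// Hypothetical stand-in for per-core state: how many guards currently
// hold preemption disabled. A real kernel would keep one counter per CPU.
static PREEMPTION_DISABLE_COUNT: AtomicUsize = AtomicUsize::new(0);

pub fn preemption_enabled() -> bool {
    PREEMPTION_DISABLE_COUNT.load(Ordering::SeqCst) == 0
}

/// While this guard is alive, the (simulated) scheduler must not switch tasks.
pub struct PreemptionGuard(());

pub fn hold_preemption() -> PreemptionGuard {
    PREEMPTION_DISABLE_COUNT.fetch_add(1, Ordering::SeqCst);
    PreemptionGuard(())
}

impl Drop for PreemptionGuard {
    fn drop(&mut self) {
        PREEMPTION_DISABLE_COUNT.fetch_sub(1, Ordering::SeqCst);
    }
}

/// A spinlock that disables preemption, but not interrupts, while held.
pub struct PreemptSafeMutex<T> {
    locked: AtomicBool,
    data: UnsafeCell<T>,
}

// Safe to share across threads because access to `data` is serialized by `locked`.
unsafe impl<T: Send> Sync for PreemptSafeMutex<T> {}

pub struct PreemptSafeGuard<'a, T> {
    lock: &'a PreemptSafeMutex<T>,
    // Dropped after `Drop::drop` below runs, i.e. after the lock is released.
    _preempt: PreemptionGuard,
}

impl<T> PreemptSafeMutex<T> {
    pub const fn new(data: T) -> Self {
        Self { locked: AtomicBool::new(false), data: UnsafeCell::new(data) }
    }

    pub fn lock(&self) -> PreemptSafeGuard<'_, T> {
        // Disable preemption first so we cannot be switched out while
        // spinning on, or holding, the lock.
        let preempt = hold_preemption();
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop();
        }
        PreemptSafeGuard { lock: self, _preempt: preempt }
    }
}

impl<T> Deref for PreemptSafeGuard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        unsafe { &*self.lock.data.get() }
    }
}

impl<T> DerefMut for PreemptSafeGuard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        unsafe { &mut *self.lock.data.get() }
    }
}

impl<T> Drop for PreemptSafeGuard<'_, T> {
    fn drop(&mut self) {
        // Release the lock; preemption is re-enabled when `_preempt` drops next.
        self.lock.locked.store(false, Ordering::Release);
    }
}
```

The point of the sketch is the ordering: preemption is disabled before the lock is acquired and re-enabled only after it is released, while interrupts stay untouched throughout. Such a lock is only appropriate for data that no interrupt handler ever accesses.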

Jul 29 '22 02:07 tsoutsman

This is now in progress since #595 is merged. I will first work on using preemption guards in scheduling and task management, since that's the clearest avenue for efficiency wins.

Aug 11 '22 01:08 kevinaboos

As I mused earlier, another good idea may be to introduce a more fine-grained interrupt-safe lock crate that disables only certain interrupts by masking those IRQs.

For example, if a task is accessing state that's shared with only the e1000 NIC driver's interrupt handler, then the lock surrounding that shared state can simply mask the e1000 IRQ instead of disabling all interrupts.

However, we'd have to create a lint or some form of code analysis to check this, otherwise we'd be walking head-first into a wall of potential deadlocks.
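A rough sketch of what such a lock could look like, purely as an illustration: `IrqMaskedMutex`, `mask_irq`, and the 16-bit `IRQ_MASK` register are hypothetical names, the mask register is simulated with an atomic, and `std::sync::Mutex` stands in for a kernel spinlock. It does not address the deadlock-analysis problem mentioned above.

```rust
use std::ops::{Deref, DerefMut};
use std::sync::atomic::{AtomicU16, Ordering};
use std::sync::{Mutex, MutexGuard};

// Simulated interrupt mask register: bit n set means IRQ n is masked.
static IRQ_MASK: AtomicU16 = AtomicU16::new(0);

pub fn irq_is_masked(irq: u8) -> bool {
    assert!(irq < 16);
    (IRQ_MASK.load(Ordering::SeqCst) & (1u16 << irq)) != 0
}

/// Keeps one IRQ masked for as long as it is alive.
pub struct IrqMaskGuard {
    bit: u16,
    was_masked: bool,
}

pub fn mask_irq(irq: u8) -> IrqMaskGuard {
    assert!(irq < 16);
    let bit = 1u16 << irq;
    let prev = IRQ_MASK.fetch_or(bit, Ordering::SeqCst);
    IrqMaskGuard { bit, was_masked: (prev & bit) != 0 }
}

impl Drop for IrqMaskGuard {
    fn drop(&mut self) {
        // Only unmask if this guard was the one that masked the IRQ.
        if !self.was_masked {
            IRQ_MASK.fetch_and(!self.bit, Ordering::SeqCst);
        }
    }
}

/// A lock for state shared with exactly one interrupt handler.
pub struct IrqMaskedMutex<T> {
    irq: u8,
    inner: Mutex<T>,
}

pub struct IrqMaskedGuard<'a, T> {
    // Fields drop in declaration order: the lock is released first,
    // and only then is the IRQ unmasked.
    data: MutexGuard<'a, T>,
    _mask: IrqMaskGuard,
}

impl<T> IrqMaskedMutex<T> {
    pub fn new(irq: u8, data: T) -> Self {
        Self { irq, inner: Mutex::new(data) }
    }

    pub fn lock(&self) -> IrqMaskedGuard<'_, T> {
        // Mask the IRQ before taking the lock, so the handler cannot
        // fire on this core while we hold it.
        let mask = mask_irq(self.irq);
        let data = self.inner.lock().unwrap_or_else(|e| e.into_inner());
        IrqMaskedGuard { data, _mask: mask }
    }
}

impl<T> Deref for IrqMaskedGuard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.data
    }
}

impl<T> DerefMut for IrqMaskedGuard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.data
    }
}
```

All other IRQs remain deliverable while the guard is held, which is the whole benefit over disabling interrupts globally. The cost is that correctness now depends on every lock being paired with the right IRQ number, which is exactly what a lint would need to verify.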

Aug 11 '22 01:08 kevinaboos

#600 paves the way for properly handling (enabling/disabling) preemption and interrupts during task switching.

Aug 12 '22 18:08 kevinaboos

#603 implements this for schedule() and task_switch()

Aug 16 '22 23:08 kevinaboos

#616 implements this for task lifecycle functions that clean up task state after a task has exited.

Aug 23 '22 18:08 kevinaboos

I think the only remaining obvious area for this is runqueue management, which I can tackle soon.

Aug 24 '22 21:08 kevinaboos

With #629 merged in, I think this issue can be closed.

Looking through the code base, the only other areas where interrupt-safe locks could be removed are:

the heap implementation(s) -- but this requires a complete redesign to leverage Rust's new-ish allocator API, where you can explicitly choose an allocator when instantiating a heap type. The idea is that regular task contexts would use the default allocator, which wouldn't need an interrupt-safe lock, while interrupt handler contexts would use a dedicated allocator (whose usage should be quite rare) that would require interrupt-safe locks.
waitqueues -- also in need of a redesign, which is planned.
channel implementations -- an easy follow-up todo item after waitqueues are redesigned.
logging -- the later "full" logger should use a ring buffer or some dedicated memory region by default instead of just dumping things directly to a serial port. Right now we keep it that way for simplicity and to guarantee synchronous, immediate logging (with no flushing required), but in the future that behavior can be relegated to a special cfg option to help debug Theseus when things go awry during early boot.
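For the logging item, a ring buffer along these lines is one possible shape. This is only a sketch under assumptions: `LogRing` and its methods are hypothetical names, the capacity `N` must be non-zero, and a real logger would still need a lock or a lock-free scheme around it plus a separate task that drains the buffer to the serial port.

```rust
/// Fixed-capacity byte ring buffer that overwrites the oldest bytes when full.
pub struct LogRing<const N: usize> {
    buf: [u8; N],
    head: usize, // index where the next byte will be written
    len: usize,  // number of valid bytes currently stored (at most N)
}

impl<const N: usize> LogRing<N> {
    pub const fn new() -> Self {
        Self { buf: [0; N], head: 0, len: 0 }
    }

    /// Append a message; never blocks and never touches a device.
    pub fn push_str(&mut self, s: &str) {
        for &b in s.as_bytes() {
            self.buf[self.head] = b;
            self.head = (self.head + 1) % N;
            if self.len < N {
                self.len += 1;
            }
        }
    }

    /// Remove and return all buffered bytes, oldest first.
    pub fn drain(&mut self) -> Vec<u8> {
        let start = (self.head + N - self.len) % N;
        let out: Vec<u8> = (0..self.len).map(|i| self.buf[(start + i) % N]).collect();
        self.len = 0;
        out
    }
}
```

Because `push_str` only copies bytes into memory, the time spent holding whatever lock protects the buffer is short and bounded, unlike a synchronous write to a serial port.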

@tsoutsman let me know if you have observed any other areas of the system that can be quickly addressed; if not, let's close this.

Aug 25 '22 22:08 kevinaboos

The only thing I'll add to that is sleep, which is currently implemented using PIT interrupts. But that's also in need of a redesign.

Aug 31 '22 02:08 tsoutsman
