qthreads
Add support for fast context switch on M1
A faster alternative to ucontext on Apple M1 hardware is desirable. One path is porting the current aarch64 asm implementation to Apple's object format (Mach-O). https://developer.apple.com/library/archive/documentation/Performance/Conceptual/CodeFootprint/Articles/MachOOverview.html Does ABT support fast context switch on M1 hardware?
The discussion on not storing signal masks is covered here: https://github.com/Qthreads/qthreads/pull/95
Regarding argobots, they have separate but very similar fast context implementations for elf/linux and macho/mac, both slightly adapted from those in boost. See https://github.com/pmodels/argobots/tree/main/src/arch/fcontext
As I have pointed out before, those implementations don't save all the registers. However, we never found a definitive reason why only certain registers were saved. My guess is that, since most ULT code assumes a coroutine model, the compiler treats a yield like any other function call: it makes sure all state in caller-saved registers is put into memory before the call and reloaded when the thread resumes from the yield.
Yes, @cjhackillinois I recall that. My intent was not that @janciesko should use those implementations, but that by comparing them to each other he could get an idea of what he'd need to change in the Qthreads arm64 Linux swapcontext code to arrive at the right m1 mac code.
Does our current arm64 code work on M1?
The format of the assembly for M1 is slightly different, so no. But the changes should be minor to create the m1 version of the code.
Looking at the argobots elf vs macho code, the main differences I see are
- .align 2 vs .balign 16
- underscores to start function names
- .size
- .section
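One way to carry the differences listed above in a single preprocessed `.S` file is a small set of macros keyed on `__APPLE__`. The sketch below is only an illustration: the macro names (`ASM_SYM`, `ASM_ALIGN`, `ASM_MACHO`) and the symbol `my_swapcontext` are hypothetical and are not taken from Qthreads or argobots. It stringifies the macros in plain C so the expansion can be inspected without an assembler.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper macros (assumption: not from Qthreads or argobots). */
#if defined(__APPLE__)
/* Mach-O: C-visible symbols carry a leading underscore. */
#define ASM_MACHO 1
#define ASM_SYM(name) _##name
#define ASM_ALIGN .balign 16 /* byte alignment of 16 */
#else
/* ELF: symbol names are used as written. */
#define ASM_MACHO 0
#define ASM_SYM(name) name
#define ASM_ALIGN .align 2 /* on arm64 gas this is 2^2 = 4 bytes */
#endif

/* Two-level stringification so the argument is macro-expanded first. */
#define ASM_STR_(x) #x
#define ASM_STR(x) ASM_STR_(x)

/* Symbol name as the assembler would see it on this platform. */
static const char *asm_symbol_name(void) {
    return ASM_STR(ASM_SYM(my_swapcontext));
}

/* Alignment directive that would precede the function label. */
static const char *asm_align_directive(void) {
    return ASM_STR(ASM_ALIGN);
}

/* ELF-only directives (.type, .size, .section .note.GNU-stack) would be
 * wrapped in "#if !ASM_MACHO" in the .S file; this reports that choice. */
static int asm_emits_elf_metadata(void) {
    return !ASM_MACHO;
}
```

Printing `asm_symbol_name()` and `asm_align_directive()` on each platform shows which spelling the assembly source would receive. In the real `.S` file the same macros would be used directly, e.g. `ASM_SYM(my_swapcontext):` as the label.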