Peter Cawley
I won't be at FOSDEM this year. Those of you who are, have fun!
I will be around from the afternoon of the 2nd through until the afternoon of the 6th, staying at JAM.
Looking at the actual LOOP section of the two traces, they seem pretty much 99% instruction-for-instruction identical: ``` ->LOOP: ->LOOP: 2a4e987a mov r9d, 0x5abb 2a4e9880 xor r8d, r8d 2a4de033 xor...
The thing which immediately jumps out at me as interesting is that some traces are expensive in GC64, but not in GC32. For example, `TRACE_126::apps/rss/rss.lua:187` takes 5.27% of the time...
IIRC, I removed the `mainthref` field because you need a `global_State*` in order to use it, but once you have a `global_State*`, you can just get the main thread via...
> Assembler support for apple AMX in LLVM/GCC This bit is not strictly required; [aarch64.h](https://github.com/corsix/amx/blob/main/aarch64.h) works with unmodified compilers. > ... kernel ... On the technical front, you've got extra...
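To illustrate why assembler support is not strictly required: a header can emit each instruction as a raw 32-bit word, so the compiler never needs to know the mnemonics. The sketch below only computes such a word and does not execute anything. It assumes the encoding described in the corsix/amx repository (a reserved A64 base of `0x00201000`, a 5-bit opcode, and a 5-bit operand); `amx_encode` and `AMX_BASE` are hypothetical names for illustration, not identifiers from `aarch64.h`.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed base of the reserved A64 encoding space used by AMX
   instructions, per the corsix/amx documentation. */
#define AMX_BASE 0x00201000u

/* Hypothetical helper: build the 32-bit instruction word from a 5-bit
   opcode and a 5-bit operand (a GPR number, or a small immediate for
   the opcodes that take one). */
static uint32_t amx_encode(uint32_t op, uint32_t operand) {
    assert(op < 32 && operand < 32);   /* both fields are 5 bits wide */
    return AMX_BASE | (op << 5) | operand;
}
```

On suitable hardware, a word like `amx_encode(17, 0)`, which is `0x00201220`, could be placed in the instruction stream with a `.word` directive inside inline assembly, which any unmodified compiler accepts.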
Done in [42a391b](https://github.com/corsix/amx/commit/42a391b9b77c9191fd8b8fa39b05e5f67cb6bd48).
RE 1, my assumption is that we're seeing ISA evolution; there was an AMX on iPhone hardware before AMX on M1, and my guess is that `mac16`/`fm[as][16,32,64]` were in the...