ARM64EC: What even is it?
In addition to the four major architectures (x86, x64, ARM, ARM64), we support CHPE and ARM64EC. In particular, ARM64EC is both a priority and actively evolving. However, as library devs, we have a highly incomplete and confused understanding of how this works. It would be extremely helpful to have documentation (either on Microsoft Docs or just on the STL Wiki) explaining a few important things for library devs. In particular:
- The high-level scenario of how these binaries are compiled, what processor they run on, and what code they interact with (i.e. how ARM64EC interacts with x64 and ARM64)
- What intrinsics are available
  - Can we use ARM64 intrinsics?
  - Can we use x64 intrinsics? (Some subset of them?)
  - What's emulated versus native?
I recall getting some answers to these questions in various emails but have forgotten, so having a single up-to-date reference would be very helpful.

I believe this MS docs page should answer a lot of questions: Understanding Arm64EC ABI and assembly code
Nicole's understanding of the situation from discussions over coffee:
- Arm64EC is code which is compiled to target ARM64 (so the machine code can be run directly), but whose ABI allows emulated x64 code to call into it (and allows it to call into emulated x64 code) without any extra ABI conversion cost.
- For example, Arm64EC's registers map exactly to x64 registers, as they're used in emulated x64 code; `RCX` is `X0` in emulated x64-on-Arm64 code, so Arm64EC uses `X0` to mean the same thing as `RCX`. This also means that any registers which don't map to an x64 register (like `x13`) are unusable in ARM64EC.
- You compile ARM64EC code by using the ARM64-targeting compiler with the `/arm64EC` option, and then link it with `/machine:arm64ec` (I can't confirm that yet, will try).
- Given this, my assumption is that the NEON intrinsics should be available?
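
To make the register-mapping point concrete, here is a minimal sketch of a partial lookup table. The four argument-register pairs and the list of disallowed general-purpose registers are written from memory of the Arm64EC ABI documentation and should be checked against it; the names `RegPair`, `kArgRegs`, `kDisallowedGprs`, `arm64_for_x64`, and `is_disallowed` are hypothetical and exist only for illustration.

```cpp
#include <cstring>

// Illustrative only: a PARTIAL view of how emulated x64 registers line up
// with ARM64 registers under the Arm64EC ABI. Verify against the ABI docs.
struct RegPair {
    const char* x64;
    const char* arm64;
};

// Assumption: the first four x64 integer argument registers map to x0-x3.
constexpr RegPair kArgRegs[] = {
    {"rcx", "x0"},
    {"rdx", "x1"},
    {"r8", "x2"},
    {"r9", "x3"},
};

// Assumption: ARM64 general-purpose registers with no x64 counterpart,
// which Arm64EC code therefore may not use.
constexpr const char* kDisallowedGprs[] = {"x13", "x14", "x23", "x24", "x28"};

// Returns the ARM64 register paired with an x64 register,
// or nullptr if the register is not in this partial table.
inline const char* arm64_for_x64(const char* x64_reg) {
    for (const RegPair& p : kArgRegs) {
        if (std::strcmp(p.x64, x64_reg) == 0) {
            return p.arm64;
        }
    }
    return nullptr;
}

// Returns true if the named ARM64 register is in the disallowed list above.
inline bool is_disallowed(const char* arm64_reg) {
    for (const char* r : kDisallowedGprs) {
        if (std::strcmp(r, arm64_reg) == 0) {
            return true;
        }
    }
    return false;
}
```

For example, `arm64_for_x64("rcx")` yields `"x0"`, and `is_disallowed("x13")` is true. A `nullptr` result only means the register is missing from this partial table, not that it has no mapping.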
Also useful page: https://docs.microsoft.com/en-us/windows/arm/arm64ec-abi
I believe that ARM64 intrinsics should be available, and run natively on the processor. I'm not sure if x64 intrinsics are available, but if they are, they are definitely emulated (or the compiler could convert the intrinsic to a native ARM64 instruction under the hood).
And from discussion with Pranav (backend person who works with arm64EC):
- all Intel SIMD instructions, excluding AVX, work by emulation (there's a .lib file automatically linked into ARM64 libraries that does this emulation for you),
- all ARM64 SIMD instructions work natively, and
- ARM64EC implies `_M_ARM64EC && _M_X64`.
> all Intel SIMD instructions, excluding AVX, work by emulation (there's a .lib file automatically linked into ARM64 libraries that does this emulation for you),

We already know that it is not true because simulation fails. @cbezault may recall more.

> all Intel SIMD instructions, excluding AVX, work by emulation (there's a .lib file automatically linked into ARM64 libraries that does this emulation for you),

At least these are completely missing: https://github.com/microsoft/STL/issues/2635#issuecomment-1087093512
We should probably prefer NEON instructions here instead of emulated x86 SSE/AVX instructions.
It really depends. At least in my observations, running the emulated intrinsics can actually be faster if you're already in emulated code (that is, you're running x64 code on an ARM64 machine). ARM64EC lets you jump back and forth between ARM64 native code and emulated x64 code.
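
As a rough illustration of the "prefer NEON" option, here is a sketch that selects the NEON path for both ARM64 and ARM64EC and falls back to SSE only on plain x64. It assumes that MSVC's `<arm64_neon.h>` is usable when compiling for ARM64EC, which this thread suggests but does not confirm. The function `add_floats` and the `SKETCH_USE_*` macros are hypothetical names. Whether the NEON path is actually faster is exactly the open question above, so treat this as one possible arrangement rather than a recommendation.

```cpp
#include <cstddef>

// Check the ARM64 / ARM64EC macros BEFORE _M_X64, since ARM64EC defines both.
#if defined(_M_ARM64) || defined(_M_ARM64EC)
#include <arm64_neon.h> // MSVC NEON header (assumed to work for ARM64EC)
#define SKETCH_USE_NEON 1
#elif defined(__aarch64__)
#include <arm_neon.h> // GCC/Clang NEON header
#define SKETCH_USE_NEON 1
#elif defined(_M_X64) || defined(__x86_64__)
#include <xmmintrin.h> // SSE
#define SKETCH_USE_SSE 1
#endif

// Computes out[i] = a[i] + b[i] for i in [0, n).
// Processes four floats at a time where a vector path is available,
// then finishes any remaining elements with scalar code.
inline void add_floats(const float* a, const float* b, float* out, std::size_t n) {
    std::size_t i = 0;
#if defined(SKETCH_USE_NEON)
    for (; i + 4 <= n; i += 4) {
        vst1q_f32(out + i, vaddq_f32(vld1q_f32(a + i), vld1q_f32(b + i)));
    }
#elif defined(SKETCH_USE_SSE)
    for (; i + 4 <= n; i += 4) {
        _mm_storeu_ps(out + i, _mm_add_ps(_mm_loadu_ps(a + i), _mm_loadu_ps(b + i)));
    }
#endif
    for (; i < n; ++i) {
        out[i] = a[i] + b[i]; // scalar tail, or full scalar fallback
    }
}
```

Swapping the order of the first and last branches would make ARM64EC builds take the SSE path instead, which is the emulated alternative discussed above.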
There's also a text file in the MSVC linker directory that has a pretty good explanation of the symbol naming conventions and how symbols are selected: `src/vctools/Link/doc/Arm64ECLinking.md`

> I believe this MS docs page should answer a lot of questions: Understanding Arm64EC ABI and assembly code

Wow, that has a ton of useful information.
After reading http://emulators.com/docs/abc_arm64ec_explained.htm linked by @YexuanXiao in https://github.com/microsoft/STL/pull/5597#issuecomment-2976240542 (thanks!), I think I finally have a decent understanding of what ARM64EC is. The preprocessor macro scheme made it really confusing for me.