cpu: intel: correct assumptions for x32 abi
The last I heard, nobody is using the x32 ABI in practice. Do we really need to support it?
We do maintain an x32 port at PLD Linux, and so does Debian, so it would be nice if ring built fine for this target. I'm not really familiar with the project; I'm only here because of a maturin build failure (through the rustls dependency). If there is any assembly involved, I don't think it's worth spending time on supporting x32 there. Is there some generic code path, not involving assembly, that could be taken for x32?
Either way, to support this target we'd need the coverage tests to be expanded to run for the x32 target too. I think that's the first step here. In particular, I am curious whether GitHub Actions even supports an operating system that can run x32 binaries. You can try it out by copying all the x86_64-* entries in the coverage job in ci.yml to create duplicate jobs for the gnux32 target.
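As a rough illustration only (the field names here are guesses; ring's actual ci.yml layout differs), a duplicated coverage matrix entry for the x32 target might look like:

```yaml
# Hypothetical sketch: copy an existing x86_64-* coverage entry and swap in
# the gnux32 target triple. Field names are assumptions, not ring's actual CI.
- target: x86_64-unknown-linux-gnux32
  host_os: ubuntu-22.04
  mode: --release
```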
If there is any assembly involved I don't think it's worth spending time on supporting x32 in there.
I didn't look at the details of the x32 ABI. If the ABI is defined such that pointers and size_t values are passed in 64-bit registers, zero-extended, then it can work fine without changing the assembly. My understanding is that x32 does zero-extend pointers passed in registers. If it doesn't zero-extend size_t values passed in registers, then we'd need to audit our assembly code to verify that it always zeroes the high half of every size/length parameter declared as usize/size_t/NonZero_size_t, and/or change each such size/length parameter to be typed u64.
Is there some generic code path not involving assembly that could be taken for x32?
I don't think taking the generic code path for the x32 ABI is a good strategy. One thing we could do is change the CPU feature detection logic so that for the x32 ABI, no CPU features are detected. That would limit the amount of assembly code that would be used for the x32 ABI. But we'd still need to review bn_mul_mont and the SHA-256/SHA-512 assembly code (at least), which is used even when no CPU features are detected.
I took a step towards this in PR #2560.
Also PTAL at #2561 which fixes the definition of the word size for AArch64 ILP32 and x86_64 x32 so that we continue to do 64-bit arithmetic.
I'm closing this PR because it doesn't really do what we need.
Ideally, for each assembly function, we'd have a shim that takes 32-bit arguments with garbage in the high half, clears the high half of each one, and calls the assembly. Or we'd change the FFI declaration so that every argument is a 64-bit argument, which would require new types that represent a 64-bit pointer, 64-bit usize, etc. Or, worst case, we'd change the dispatching so that we simply skip the assembly implementations for x32 and ilp32.