`-mcx16` is only available on x86 CPUs
On non-macOS platforms, o2 is unconditionally compiled with the -mcx16 flag, as can be seen in https://github.com/rbdannenberg/o2/blob/253d109f4a2eaf62a885663ca50a34a6f04dbd74/CMakeLists.txt#L361
However, this flag is only available for x86 CPUs (32-bit, 64-bit), and not for other architectures.
I discovered this when compiling for s390x (which is admittedly an odd target architecture for o2), but the same problem occurs on common architectures like armhf (as found e.g. in the Raspberry Pi) and arm64 (RPi4, ...).
Does the flag cause a serious problem? I'm not sure how to add the flag conditioned on the target architecture. As noted elsewhere, there might be a bigger issue with atomics on ARM (-mcx16 enables code generation used for atomic list operations in O2).
> Does the flag cause a serious problem?
if "serious" means "does not compile" then: yes.
$ touch foo.c
$ gcc foo.c -mcx16 -o foo.o
gcc: error: unrecognized command-line option '-mcx16'
$ echo $?
1
$ uname -a
Linux amdahl 5.10.0-18-arm64 #1 SMP Debian 5.10.140-1 (2022-09-02) aarch64 GNU/Linux
$
i think it would be nice if o2 could work on Raspberry Pi and Apple M1.
Yes, I agree: Raspberry Pi and Apple M1. All my regression tests run on Apple M2, so I assume they work on M1 as well (but it would be nice to confirm). If you mean 32-bit Raspberry Pi, then the starting point is lock-free lists. libatomic is not guaranteed to be lock-free, but I don't know the Raspberry Pi situation for sure. EDIT: Upon further investigation, it appears that Raspberry Pi can do lock-free lists, e.g. if not with libatomic, then mintomic looks good. Also, I think the 32-bit problem might just be that O2 assumes 64-bit words and 8-byte alignment of some things, but those could be enforced on 32-bit architectures for simplicity and it should all work. I am planning to buy a Raspberry Pi Zero W, figuring newer, bigger versions will be backward compatible. Any advice is welcome.
Should I add an explicit option called ARM_ARCHITECTURE that will inhibit the -mcx16 flag? I don't know a better way to do this.
I think it's better not to mention the "arm" architecture, as this is really about non-x86. (According to the GCC docs, '-mcx16' is a flag that is only valid on x86.)
> Yes, I agree: Raspberry Pi and Apple M1
just to be clear: i'm of course (mostly) talking about Linux here. i threw in the "Apple M1" as an obvious bait.
Currently, building for macOS/arm64 (that is, the typical "Apple M1" setup) works fine, simply because the -mcx16 flag is omitted on this platform (presumably because the alignment is correct on this platform anyhow), which basically hides the issue.
> I am planning to buy a Raspberry Pi Zero W
that sounds like a good plan.
FWIW there's CMAKE_SYSTEM_PROCESSOR which is not really platform agnostic, but probably set to x86_64 for Linux targets where -mcx16 is appropriate.
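A minimal sketch of what that could look like, assuming the flag ends up in the ARCHITECTURE_C_FLAGS variable mentioned elsewhere in this thread. The helper variable O2_X86_PATTERN and the list of processor names are my own assumptions, not taken from O2's CMakeLists.txt, and the values CMAKE_SYSTEM_PROCESSOR actually reports vary by platform and toolchain file:

```cmake
# Sketch only: add -mcx16 when the target CPU looks like x86.
# O2_X86_PATTERN is a hypothetical helper, not part of O2.
set(O2_X86_PATTERN "^(x86_64|amd64|AMD64|i[3-6]86|x86)$")

if(CMAKE_SYSTEM_PROCESSOR MATCHES "${O2_X86_PATTERN}")
  # x86 target: GCC and Clang accept -mcx16 here.
  set(ARCHITECTURE_C_FLAGS "-mcx16")
else()
  # armhf, arm64, s390x, ...: the flag is rejected, so leave it out.
  set(ARCHITECTURE_C_FLAGS "")
endif()
```

When cross-compiling, CMAKE_SYSTEM_PROCESSOR comes from the toolchain file, so this only works if the toolchain file sets it correctly.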
I'm running O2 on Apple M1 now. Everything seems to be OK wrt -mcx16
i haven't done any recent checks but when you say
> I'm running O2 on Apple M1 now. Everything seems to be OK wrt -mcx16
..., how does this relate to my
> Currently, building for macOS/arm64 (that is, the typical "Apple M1" setup) works fine, simply because the -mcx16 flag is omitted on this platform
?
that is: i think we had already established that macOS/arm64 works fine. the problem is with Linux/non-x86
should i update the title accordingly, or just open a new issue?
Sorry about the cryptic reply earlier. I'm not sure when I made the changes relative to the posts here, but I revised the code to do 16-byte alignment everywhere, even on a 32-bit Raspberry Pi, and I have O2 running on the Raspberry Pi. It's possible there's still a problem in the CMake files because -mcx16 is apparently always set for Linux. There's a comment/hint that should appear in CMake under ARCHITECTURE_C_FLAGS that says "-mcx16 flag is required for x64 builds, but you should clear this for other architectures." Maybe this should be more automatic, but it also gets confusing when you are cross-compiling, e.g. I do builds for x86_64 on my Mac M2 laptop.
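One way the default could become "more automatic" without naming any architecture is to ask the configured compiler whether it accepts the flag. The sketch below uses CMake's standard CheckCCompilerFlag module; the names O2_HAVE_MCX16 and O2_DEFAULT_ARCH_FLAGS are hypothetical, and I am assuming ARCHITECTURE_C_FLAGS is (or can be) a cache variable, which may not match how the O2 CMake files define it:

```cmake
# Sketch only: probe the compiler instead of guessing from the platform.
include(CheckCCompilerFlag)

# Tries to compile a trivial program with -mcx16 using the configured
# C compiler (the cross compiler, if a toolchain file is in use).
check_c_compiler_flag("-mcx16" O2_HAVE_MCX16)

if(O2_HAVE_MCX16)
  set(O2_DEFAULT_ARCH_FLAGS "-mcx16")
else()
  set(O2_DEFAULT_ARCH_FLAGS "")
endif()

# A value already present in the cache (e.g. set by the user with
# -DARCHITECTURE_C_FLAGS=...) is kept; this only supplies the default.
set(ARCHITECTURE_C_FLAGS "${O2_DEFAULT_ARCH_FLAGS}" CACHE STRING
    "Architecture-specific C flags (-mcx16 is required for x64 builds)")
```

Because the probe runs the target compiler, it should give the right answer when cross-compiling as long as the toolchain is configured before this code runs, and the manual override stays available for cases where the probe gets it wrong.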