SeBa, MESA r2208 and GalactICS fail precision tests on ARM processors
When running on ARM processors (Apple Silicon as well as Raspberry Pi), SeBa, MESA r2208 and GalactICS fail some of the tests because of what seems to be a numerical precision issue. GitHub Actions runs:
- GalactICS: https://github.com/LourensVeen/amuse/actions/runs/15856287284
- SeBa: https://github.com/LourensVeen/amuse/actions/runs/15846565565
- MESA r2208: https://github.com/LourensVeen/amuse/actions/runs/15846565519
These tests pass on my AMD Zen laptop running Linux with both GCC and Clang, and on GitHub Actions (Intel?) with GCC. Also, SeBa is in C++ while GalactICS and MESA are in Fortran, so it doesn't seem to be a compiler or OS issue.
The one thing we can think of at the moment is the 80-bit extended-precision long double format that x86 FPUs (x87) have and ARM FPUs don't. GCC has a switch to use only SSE on x86 (`-mfpmath=sse`), which computes in at most 64 bits, so we can try that to see whether it triggers the problem on x86 as well. (Note that on 64-bit x86, SSE is already the default for float and double, so the x87 path would mostly come into play through explicit long double use or 32-bit builds.)
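As a small illustration (in Python/NumPy, since the failing test below is Python, and this is a sketch of the general effect rather than of what SeBa/GalactICS/MESA actually compute): `numpy.longdouble` maps to the 80-bit x87 format on x86 Linux, but to plain 64-bit double or IEEE quad on ARM platforms, so the same expression can come out differently:

```python
import numpy

# 1 + 2**-53 is not representable as a 64-bit double (it rounds back to 1.0),
# but it is exactly representable in the 80-bit x87 extended format.
x = numpy.longdouble(1) + numpy.longdouble(2) ** -53

# On x86 Linux (80-bit longdouble) this prints False; on platforms where
# longdouble is just float64, it prints True.
print(x == 1)

# Rounding back down to 64 bits discards the extra bits on every platform.
print(numpy.float64(x) == 1.0)

# Mantissa bits in longdouble: 63 on x86, typically 52 or 112 on ARM.
print(numpy.finfo(numpy.longdouble).nmant)
```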
The SeBa tests also fail on an x86 macOS installation... So it's not (just) ARM that's the issue.
I've now tested SeBa on Intel and ARM versions of macOS and Linux, and all of these fail on the same tests (ARM/Linux on the Raspberry Pi fails one extra test). I will try the same for GalactICS and MESA r2208 next, but so far I cannot reproduce this issue.
A relevant snippet from the failing GalactICS test:
```python
if platform.processor() == 'ppc64le':
    # on ppc64le, the model generation has small differences from intel
    # change expected pos
    expected_mean_pos = numpy.array([73.5628, 76.251034, 75.53434])
else:
    expected_mean_pos = numpy.array(
        [73.768384103536604, 76.03533643054962, 75.176319462463255])
```
This is also where there is a difference on macOS, as the result here is:

```python
In [21]: numpy.array([numpy.mean(abs(x_positions)), numpy.mean(abs(y_positions)),
    ...: numpy.mean(abs(z_positions))])
Out[21]: array([ 73.80516423, 76.07324579, 75.21380085])
```
so, this may not be unexpected?
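If these differences turn out to be legitimate platform drift rather than a bug, one option (a sketch, not the current test code) would be to replace the per-platform expected values with a single reference plus a tolerance, e.g. using `numpy.testing.assert_allclose`:

```python
import numpy
import numpy.testing

# Reference and ppc64le values quoted from the failing test above.
reference = numpy.array([73.768384103536604, 76.03533643054962, 75.176319462463255])
ppc64le = numpy.array([73.5628, 76.251034, 75.53434])

# rtol=1e-2 is wide enough to absorb the ppc64le (and macOS) differences,
# while still failing for results that are off by more than about 1%.
numpy.testing.assert_allclose(ppc64le, reference, rtol=1e-2)
```

The tolerance would need tuning per test, but it would remove the `platform.processor()` branching entirely.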
Though this is a difference between OSes rather than processors... (I can now reproduce the differences in failed tests for MESA and GalactICS between macOS and Linux)
To me, it looks like a difference in a random number generator rather than a precision issue, maybe.
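A quick check of the magnitudes is consistent with that (values copied from the test and the macOS run quoted above):

```python
import numpy

expected_linux = numpy.array([73.768384103536604, 76.03533643054962, 75.176319462463255])
observed_macos = numpy.array([73.80516423, 76.07324579, 75.21380085])

rel_diff = numpy.abs(observed_macos - expected_linux) / expected_linux
print(rel_diff)                 # each component is roughly 5e-4
print(numpy.finfo(float).eps)   # ~2.2e-16

# A relative difference of ~5e-4 is about twelve orders of magnitude above
# double-precision rounding error, so either the rounding error is being
# amplified enormously, or (more simply) the platforms are drawing
# different random samples.
```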
Trying to summarise what we currently know:
| Scenario | CPU | OS | Compiler | Branch | Env | SeBa | GalactICS | MESA r2208 |
|---|---|---|---|---|---|---|---|---|
| GitHub Actions Intel | Intel? x86 | Ubuntu | g++/gfortran 13.3.0 | issue-1144 | Conda | Pass | Pass | Pass |
| GitHub Actions macOS | Apple Silicon | macOS | clang/gfortran 13.3.0 | issue-1144 | Conda | Fail | Fail | Fail |
| Lourens' laptop | AMD x86 | Linux | g++/gfortran 13.3.0 | issue-1144 | Conda | Pass | Pass | Pass |
| Lourens' laptop clang | AMD x86 | Linux | clang/gfortran 13.3.0 | issue-1144 | Conda | Pass | Pass | Pass |
| Steven's mac | Apple Silicon | macOS | clang/gfortran? | main | Conda | Fail | Fail | Fail |
| Steven's Raspberry Pi | ARM | Linux | g++/gfortran? | main | Conda? | Fail (+1 extra test) | ? | ? |
| Steven's Intel mac | Intel | macOS | clang/gfortran | main | Conda | Fail | ? | ? |
| Steven's Intel Linux | Intel | Linux | g++/gfortran? | main | venv | Fail | ? | ? |