test_elmhes and test_formats fail on non-x86_64
Working on updating the Fedora package to 1.0.5 and getting:
Start 82: test_elmhes.pro
82: Test command: /builddir/build/BUILD/gdl-v1.0.5/build/src/gdl "-quiet" "-e" "if execute('test_elmhes') ne 1 then exit, status=1"
82: Working Directory: /builddir/build/BUILD/gdl-v1.0.5/build/testsuite
82: Environment variables:
82: LC_COLLATE=C
82: GDL_PATH=/builddir/build/BUILD/gdl-v1.0.5/testsuite/:/builddir/build/BUILD/gdl-v1.0.5/src/pro/
82: GDL_STARTUP=
82: IDL_STARTUP=
82: Test timeout computed to be: 3600
82: % Compiled module: TEST_ELMHES.
82: % Compiled module: ERRORS_ADD.
82: % TEST_ELMHES: Error on operation : bad result elmhes
82: % TEST_ELMHES: Error on operation : bad result elmhes,/no_balance
82: % TEST_ELMHES: Error on operation : bad result elmhes,/column
82: % Compiled module: BANNER_FOR_TESTSUITE.
82: % Compiled module: GDL_IDL_FL.
82: % TEST_ELMHES: ===================================================
82: % TEST_ELMHES: = =
82: % TEST_ELMHES: = 3 errors encountered during TEST_ELMHES tests =
82: % TEST_ELMHES: = =
82: % TEST_ELMHES: ===================================================
82/212 Test #82: test_elmhes.pro ....................***Failed 0.17 sec
Start 100: test_formats.pro
100: Test command: /builddir/build/BUILD/gdl-v1.0.5/build/src/gdl "-quiet" "-e" "if execute('test_formats') ne 1 then exit, status=1"
100: Working Directory: /builddir/build/BUILD/gdl-v1.0.5/build/testsuite
100: Environment variables:
100: LC_COLLATE=C
100: GDL_PATH=/builddir/build/BUILD/gdl-v1.0.5/testsuite/:/builddir/build/BUILD/gdl-v1.0.5/src/pro/
100: GDL_STARTUP=
100: IDL_STARTUP=
100: Test timeout computed to be: 3600
100: % Compiled module: TEST_FORMATS.
100: % Compiled module: GDL_IDL_FL.
100: % GDL_IDL_FL: Detected Software : GDL
100: % When using the RAN1 mode, be sure to keep the RAN1 and dSFMT seed arrays in separate variables.
100: multiple reference file <<formats.GDL>> found ! First used !!
100: /builddir/build/BUILD/gdl-v1.0.5/build/testsuite/formats.GDL
100: /builddir/build/BUILD/gdl-v1.0.5/testsuite/formats.GDL
100: Files to be compared : formats.IDL, formats.GDL
100: % Compiled module: BANNER_FOR_TESTSUITE.
100: % TEST_FORMATS: =======================================================
100: % TEST_FORMATS: = =
100: % TEST_FORMATS: = 1595 errors encountered during TEST_FORMATS tests =
100: % TEST_FORMATS: = =
100: % TEST_FORMATS: =======================================================
100/212 Test #100: test_formats.pro ...................***Failed 0.65 sec
Thanks @opoplawski
Looking at the code of test_elmhes.pro, and given the way the tests are done internally,
I think these 2 failures (test_elmhes & test_formats) are both related to an issue in formats :(
I have no way to test on a recent Fedora on my side, and I have no problem on Debian, Ubuntu & OSX!
Which compiler version do you have?
Thanks
This is with gcc 14.1.1. But it's also failing on EL9 with 11.4.1. You can check recent build logs here: https://koji.fedoraproject.org/koji/packageinfo?packageID=1830
I'm pretty sure formats won't be OK on non-64-bit machines, so some tests based on formatted string comparison won't work either. The thing is, nobody in the team knows what GDL should produce on 32-bit machines! I would suggest not running these tests on 32-bit machines, as a failure there does not mean that GDL does not work, and waiting for a user to report a specific issue on a 32-bit machine.
These are all 64 bit architectures - aarch64, ppc64le, s390x
@opoplawski sorry, but your issue refers to "non-x86_64" architectures. My comment above holds: better to remove these tests from the list of tests when building on "non-x86_64" architectures, as they are meaningless there.
I was just responding to your comment about 32-bits. But if the tests only apply to x86_64 that's fine. Although it would be nice if the tests could deselect themselves on non-x86_64. Anyway, I'm excluding them now.
Thanks @opoplawski, but I feel there is a misunderstanding: according to the internet, s390x is a 32-bit machine whereas aarch64 is not. While I expect trouble on 32-bit machines, since we have no such machine with a working IDL at our disposal to cross-check, there should be no problem on a 64-bit little- or big-endian IEEE 754 architecture. So your report of a test failure is important in this case.
s390x is definitely a 64 bit architecture: https://developer.fedoraproject.org/deployment/secondary_architectures/s390.html. s390 is 31/32 bit hybrid. I'll reopen then I guess. Let me know what other information would be helpful for tracking this down.
@opoplawski, do I understand correctly that the tests pass OK on Fedora arm64 builds?
In #1788, we are introducing Apple Silicon builds to CI, but the PR is blocked by two tests failing: test_byte_conversion.pro and test_bytscl.pro; if that is the case, it then seems to be an Apple compiler issue?
To go further, one needs at least to know what fails. 1595 errors in test_formats: I guess every format is wrong. The test procedure creates a file "formats.GDL". @opoplawski could you send it? For Apple Silicon, I have access to an M1, I just need to find the time.
OK, I just compiled the current git version on a new M2 machine (OSX) and I have the same issues:
test_elmhes.pro and test_formats.pro (I will look at test_formats later!)
On an x86 processor, IDL & GDL give (first test):
P DOUBLE = -2.8958759e-07
PT STRING = '-00.00000029'
ST STRING = '101.32080078'
T FLOAT = 101.321
GDL> print, b
0.500000 11.4800 5.50000 5.00000
6.25000 30.2200 20.7500 14.5000
0.680000 3.02080 1.28000 1.28000
0.360000 0.500000 0.00000 0.00000
But on M2:
P DOUBLE = 0.0000000
PT STRING = '000.00000000'
ST STRING = '101.32079315'
T FLOAT = 101.321
GDL> print, b
0.500000 11.4800 5.50000 5.00000
6.25000 30.2200 20.7500 14.5000
0.680000 3.02080 1.28000 1.28000
0.360000 0.500000 0.00000 0.00000
So from my point of view this is just numerical rounding, and the test should be rewritten to take EPS (a numerical tolerance) into account.
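A quick check supports the rounding explanation for the two ST strings quoted above: 101.32080078 (x86) and 101.32079315 (M2) appear to be adjacent single-precision values, differing by exactly one unit in the last place. The Python sketch below is only an illustration and is not part of the GDL test suite; the helper names are hypothetical, and it relies on the standard struct module to reinterpret the float bits.

```python
import struct

def float32_bits(x):
    """Return the IEEE 754 single-precision bit pattern of x as an int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def float32_from_bits(bits):
    """Return the single-precision value encoded by a 32-bit pattern."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

# Round the x86 string value to the nearest float32.
x86_value = float32_from_bits(float32_bits(101.32080078))
# For a positive float, subtracting 1 from the bit pattern gives the
# next representable float32 below it.
previous = float32_from_bits(float32_bits(x86_value) - 1)

print("%012.8f" % x86_value)   # 101.32080078
print("%012.8f" % previous)    # 101.32079315
print(x86_value - previous)    # one ULP near 101.32, i.e. 2**-17
```

A one-ULP difference in a FLOAT result is what accumulated rounding in a different order would be expected to produce, so an exact string comparison is too strict for this value.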
Certainly. The cumulative rounding errors make our results differ between machines and, most of all, differ from IDL, which does not use the same algorithms. The difficulty is fixing a safe error margin, as precision can well drop to 1e-3 for floats.
I updated test_elmhes.pro in PR #1840 with a numerical tolerance of 1e-5. For me it is close.
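As a rough illustration of a tolerance-based comparison, here is a Python sketch. It is not the code from PR #1840, which is written in GDL. The function name and the choice of an absolute tolerance are assumptions; only the 1e-5 figure and the sample P values come from this thread.

```python
def within_tolerance(result, expected, tol=1e-5):
    """Return True when every element of result is within tol of expected.

    Uses an absolute tolerance; tol=1e-5 is the figure quoted in this thread.
    A NaN in either input makes the comparison fail.
    """
    if len(result) != len(expected):
        return False
    return all(abs(r - e) <= tol for r, e in zip(result, expected))

# P values reported in this thread: x86 gives -2.8958759e-07, M2 gives 0.0.
x86_p = [-2.8958759e-07]
m2_p = [0.0]

print(x86_p == m2_p)                   # exact comparison: False
print(within_tolerance(m2_p, x86_p))   # tolerance comparison: True
```

With these numbers the ST values (101.32080078 vs 101.32079315) differ by about 7.6e-6, which is only just inside 1e-5, so a relative tolerance may be worth considering for larger magnitudes.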
Concerning test_formats.pro, from what I see in the outputs, we do have a big/little endian problem... It is a serious issue. The good news is that I now have permanent access to an M2 OSX machine (very fast feedback). But I have no time now, and I do not feel competent on that. Maybe a simple flag could solve most of the problems. I hope @GillesDuvert will have time for that, since he previously improved formats...
The only differences are on unsigned 32 and bits ints, and on +/-NaN and +INF. I would not call it an endianness problem.
See #1949: some machines (ARM64) do not convert to unsigned ints the way Intel does. The NaN and INF issues in test_formats come from the fact that these floating-point pseudo-values are converted to unsigned integers (rather than bit fields?) before the bits are printed (for printing we use the C and C++ standards). #1949 should have suppressed the difference in float-to-unsigned-int conversion between x86_64 and ARM64 (and others, probably). In other words: apart from some NaN and Inf 'printing' problems, which are no longer tested in test_formats, there should be no difference anymore.
Closing with the above explanation. Dear Orion, you can open a new issue if there is another 'portability' problem.
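To illustrate the distinction between converting a float to an unsigned integer by value and reading its bits as a bit field, here is a small Python sketch. It is not GDL code, and the helper name is hypothetical. In C and C++, converting NaN, Inf, or an out-of-range float to an unsigned integer type is undefined behavior, which is consistent with Intel and ARM64 machines giving different results. The bit pattern, by contrast, is fixed by IEEE 754.

```python
import math
import struct

def float32_bit_field(x):
    """Return the 32 IEEE 754 single-precision bits of x as a binary string."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return format(bits, "032b")

# Reinterpreting the bits is well defined for every input, including
# the special values.
for name, value in [("+Inf", math.inf), ("-Inf", -math.inf), ("NaN", math.nan)]:
    print(name, float32_bit_field(value))

# Converting by value has no meaningful result for these inputs.
# Python raises an error instead of returning a platform-dependent number.
try:
    int(math.inf)
except OverflowError as err:
    print("int(+Inf):", err)
```

The sign bit, the all-ones exponent, and the mantissa are readable directly from the string, so printing from the bit field would not depend on how a given CPU handles the value conversion.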