
testDWAACompression and testDWABCompression fail on aarch64, i686

Open fweimer-rh opened this issue 2 years ago • 12 comments

The aarch64 failure looks like this:

 56/113 Test  #50: OpenEXRCore.testDWAACompression ...........Subprocess aborted***Exception:   2.48 sec
tempDir = '/var/tmp/OpenEXR_1wG2Kb/': 24
=======
Running testDWAACompression
  zeroes tiled: no sampling 1, 1 comp 8
File sizes do not match: '/var/tmp/OpenEXR_1wG2Kb/zeroes_imf_test_comp.exr' 3602 '/var/tmp/OpenEXR_1wG2Kb/zeroes_imf_test_comp_cpp.exr' 3707
  zeroes tiled: yes sampling 1, 1 comp 8
Files '/var/tmp/OpenEXR_1wG2Kb/zeroes_imf_test_comp.exr' and '/var/tmp/OpenEXR_1wG2Kb/zeroes_imf_test_comp_cpp.exr' differ in chunk starting at 411
  zeroes tiled: no sampling 1, 2 comp 8
File sizes do not match: '/var/tmp/OpenEXR_1wG2Kb/zeroes_imf_test_comp.exr' 4592 '/var/tmp/OpenEXR_1wG2Kb/zeroes_imf_test_comp_cpp.exr' 4692
  zeroes tiled: no sampling 2, 1 comp 8
File sizes do not match: '/var/tmp/OpenEXR_1wG2Kb/zeroes_imf_test_comp.exr' 3602 '/var/tmp/OpenEXR_1wG2Kb/zeroes_imf_test_comp_cpp.exr' 3707
  zeroes tiled: no sampling 2, 2 comp 8
File sizes do not match: '/var/tmp/OpenEXR_1wG2Kb/zeroes_imf_test_comp.exr' 4592 '/var/tmp/OpenEXR_1wG2Kb/zeroes_imf_test_comp_cpp.exr' 4692
  pattern1 tiled: no sampling 1, 1 comp 8
File sizes do not match: '/var/tmp/OpenEXR_1wG2Kb/pattern1_imf_test_comp.exr' 7640 '/var/tmp/OpenEXR_1wG2Kb/pattern1_imf_test_comp_cpp.exr' 70350
  pattern1 tiled: yes sampling 1, 1 comp 8
File sizes do not match: '/var/tmp/OpenEXR_1wG2Kb/pattern1_imf_test_comp.exr' 88311 '/var/tmp/OpenEXR_1wG2Kb/pattern1_imf_test_comp_cpp.exr' 88096
  pattern1 tiled: no sampling 1, 2 comp 8
File sizes do not match: '/var/tmp/OpenEXR_1wG2Kb/pattern1_imf_test_comp.exr' 9690 '/var/tmp/OpenEXR_1wG2Kb/pattern1_imf_test_comp_cpp.exr' 73368
  pattern1 tiled: no sampling 2, 1 comp 8
File sizes do not match: '/var/tmp/OpenEXR_1wG2Kb/pattern1_imf_test_comp.exr' 7640 '/var/tmp/OpenEXR_1wG2Kb/pattern1_imf_test_comp_cpp.exr' 70350
  pattern1 tiled: no sampling 2, 2 comp 8
File sizes do not match: '/var/tmp/OpenEXR_1wG2Kb/pattern1_imf_test_comp.exr' 9690 '/var/tmp/OpenEXR_1wG2Kb/pattern1_imf_test_comp_cpp.exr' 73368
  pattern2 tiled: no sampling 1, 1 comp 8
File sizes do not match: '/var/tmp/OpenEXR_1wG2Kb/pattern2_imf_test_comp.exr' 1890660 '/var/tmp/OpenEXR_1wG2Kb/pattern2_imf_test_comp_cpp.exr' 1745228
B half at 1001, 40 not equal: C loaded C++ 0x3195 (0.174438) vs C loaded C 0x3190 (0.173828)
Core Test failed: a == b
           file:/builddir/build/BUILD/openexr-3.1.8/src/test/OpenEXRCoreTest/compression.cpp
           line:475
       function:static void pixels::compareExact(uint16_t, uint16_t, int, int, const char*, const char*, const char*)
        Start  64: OpenEXR.testCopyDeepTiled
 57/113 Test  #51: OpenEXRCore.testDWABCompression ...........Subprocess aborted***Exception:   3.03 sec
tempDir = '/var/tmp/OpenEXR_RfGuk8/': 24
=======
Running testDWABCompression
  zeroes tiled: no sampling 1, 1 comp 9
File sizes do not match: '/var/tmp/OpenEXR_RfGuk8/zeroes_imf_test_comp.exr' 2791 '/var/tmp/OpenEXR_RfGuk8/zeroes_imf_test_comp_cpp.exr' 2861
  zeroes tiled: yes sampling 1, 1 comp 9
Files '/var/tmp/OpenEXR_RfGuk8/zeroes_imf_test_comp.exr' and '/var/tmp/OpenEXR_RfGuk8/zeroes_imf_test_comp_cpp.exr' differ in chunk starting at 411
  zeroes tiled: no sampling 1, 2 comp 9
File sizes do not match: '/var/tmp/OpenEXR_RfGuk8/zeroes_imf_test_comp.exr' 3001 '/var/tmp/OpenEXR_RfGuk8/zeroes_imf_test_comp_cpp.exr' 3071
  zeroes tiled: no sampling 2, 1 comp 9
File sizes do not match: '/var/tmp/OpenEXR_RfGuk8/zeroes_imf_test_comp.exr' 2791 '/var/tmp/OpenEXR_RfGuk8/zeroes_imf_test_comp_cpp.exr' 2861
  zeroes tiled: no sampling 2, 2 comp 9
File sizes do not match: '/var/tmp/OpenEXR_RfGuk8/zeroes_imf_test_comp.exr' 3001 '/var/tmp/OpenEXR_RfGuk8/zeroes_imf_test_comp_cpp.exr' 3071
  pattern1 tiled: no sampling 1, 1 comp 9
File sizes do not match: '/var/tmp/OpenEXR_RfGuk8/pattern1_imf_test_comp.exr' 68817 '/var/tmp/OpenEXR_RfGuk8/pattern1_imf_test_comp_cpp.exr' 68889
  pattern1 tiled: yes sampling 1, 1 comp 9
File sizes do not match: '/var/tmp/OpenEXR_RfGuk8/pattern1_imf_test_comp.exr' 234006 '/var/tmp/OpenEXR_RfGuk8/pattern1_imf_test_comp_cpp.exr' 233791
  pattern1 tiled: no sampling 1, 2 comp 9
File sizes do not match: '/var/tmp/OpenEXR_RfGuk8/pattern1_imf_test_comp.exr' 67973 '/var/tmp/OpenEXR_RfGuk8/pattern1_imf_test_comp_cpp.exr' 68043
  pattern1 tiled: no sampling 2, 1 comp 9
File sizes do not match: '/var/tmp/OpenEXR_RfGuk8/pattern1_imf_test_comp.exr' 68817 '/var/tmp/OpenEXR_RfGuk8/pattern1_imf_test_comp_cpp.exr' 68889
  pattern1 tiled: no sampling 2, 2 comp 9
File sizes do not match: '/var/tmp/OpenEXR_RfGuk8/pattern1_imf_test_comp.exr' 67973 '/var/tmp/OpenEXR_RfGuk8/pattern1_imf_test_comp_cpp.exr' 68043
  pattern2 tiled: no sampling 1, 1 comp 9
File sizes do not match: '/var/tmp/OpenEXR_RfGuk8/pattern2_imf_test_comp.exr' 1707649 '/var/tmp/OpenEXR_RfGuk8/pattern2_imf_test_comp_cpp.exr' 1707912
B half at 1001, 40 not equal: C loaded C++ 0x3195 (0.174438) vs C loaded C 0x3190 (0.173828)
Core Test failed: a == b
           file:/builddir/build/BUILD/openexr-3.1.8/src/test/OpenEXRCoreTest/compression.cpp
           line:475
       function:static void pixels::compareExact(uint16_t, uint16_t, int, int, const char*, const char*, const char*)

The i686 failure looks similar, the difference being:

-File sizes do not match: '/var/tmp/OpenEXR/pattern2_imf_test_comp.exr' 1707649 '/var/tmp/OpenEXR/pattern2_imf_test_comp_cpp.exr' 1707912
-B half at 1001, 40 not equal: C loaded C++ 0x3195 (0.174438) vs C loaded C 0x3190 (0.173828)
+File sizes do not match: '/var/tmp/OpenEXR/pattern2_imf_test_comp.exr' 1707645 '/var/tmp/OpenEXR/pattern2_imf_test_comp_cpp.exr' 1707909
+B half at 1276, 1 not equal: C++ loaded C 0x3c4b (1.07324) vs C loaded C 0x3c4e (1.07617)

Seen with 3.1.8.
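For readers unfamiliar with the `half` values in the failure messages, here is a minimal sketch (not part of the OpenEXR test suite) that decodes the reported 16-bit patterns and counts how many representable steps (ULPs) apart they are. The helper names `half_from_bits` and `ulp_distance` are made up for this illustration; the bit patterns are copied from the logs in this thread.

```python
import struct


def half_from_bits(bits: int) -> float:
    """Decode a 16-bit IEEE 754 binary16 bit pattern into a Python float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def ulp_distance(a: int, b: int) -> int:
    """Distance between two finite half bit patterns, in representable steps."""
    def ordered(bits: int) -> int:
        # Map sign-magnitude patterns onto a monotonically ordered integer line.
        return -(bits & 0x7FFF) if bits & 0x8000 else bits
    return abs(ordered(a) - ordered(b))


# Bit-pattern pairs copied from the failure messages in this thread.
pairs = [(0x3195, 0x3190), (0x3C4B, 0x3C4E), (0x2E2F, 0x2E2D)]
for a, b in pairs:
    print(f"{a:#06x} ({half_from_bits(a):.6g}) vs "
          f"{b:#06x} ({half_from_bits(b):.6g}): {ulp_distance(a, b)} ULP")
```

The mismatches are only a few ULPs wide, but `compareExact` requires identical bit patterns, so any difference at all fails the test.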

Jun 19 '23 11:06 fweimer-rh

> Seen with 3.1.8.

I confirm, 3.1.7 was fine on aarch64 and breaks with 3.1.8.

Jun 23 '23 08:06 ggardet

@kdt3rd, it looks like the test that's failing was introduced in 3.1.8 as a part of the expanded DWA support, so it's something deeper than a simple regression.

Jun 23 '23 17:06 cary-ilm

Yes, this will be in the NEON code that was ported into the core library (or in ifdefs that were mismatched while porting it).

Jun 27 '23 00:06 kthurston

Thanks, I see this as well on the ppc64le, aarch64, and armv7l architectures. Details: https://build.opensuse.org/package/show/graphics/openexr

Jun 28 '23 07:06 pgajdos

@fweimer-rh is it possible that you are compiling with custom CXXFLAGS for architecture choices, and that the Core library now needs the same options passed through CFLAGS (it is pure C, not C++)? What I thought might be the problem does not appear to be an obvious issue, but I have seen errors like this when one library uses SIMD extensions and the other doesn't, because the two then don't produce identical numerical results. Having two implementations is temporary: once we gain confidence that they are the same in all cases, we'll use the C implementation from the C++ layer.

Jul 10 '23 09:07 kdt3rd

@kdt3rd We haven't changed our build flags on aarch64 or i686, so that's probably not really the issue here. Our i686 variant does not even have FMA (but aarch64 does).
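As a generic illustration of why code paths with and without FMA can disagree in the last bits (this is not a reproduction of the OpenEXR DWA code), the sketch below compares a multiply-add rounded once, as a hardware FMA does, with one rounded after each step. The function names are hypothetical, and the fused result is emulated with exact rational arithmetic rather than a real FMA instruction.

```python
from fractions import Fraction


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Emulate FMA: compute a*b + c exactly, then round to double once."""
    # Fraction arithmetic is exact; float() performs the single final rounding.
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def separate_multiply_add(a: float, b: float, c: float) -> float:
    """Plain double arithmetic: round after the multiply and after the add."""
    return a * b + c


# Inputs chosen so that the exact product 1 - 2**-60 is not representable
# as a double and rounds to 1.0 when the multiply is rounded on its own.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

print("fused   :", fused_multiply_add(a, b, c))
print("separate:", separate_multiply_add(a, b, c))
```

Both results are valid roundings of the same expression, which is why an exact bit-for-bit comparison between two implementations is sensitive to which instructions the compiler selects on each architecture.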

Jul 10 '23 09:07 fweimer-rh

It seems these tests pass only on x86_64. With 3.1.9, these tests fail on i686, aarch64, ppc64le, and s390x (tested with Fedora rawhide).

Jul 10 '23 23:07 yselkowitz

If we can confirm that the problem is in the test and not an actual architecture issue, I can skip the tests on non-x86_64 arches.

Jul 10 '23 23:07 hobbes1069

It would appear that using pre-computed values results in slightly different output on non-SSE architectures. I've reverted that change for now in #1488 (I'll let it test on main, then pick it onto the 3.1 release branch).

Jul 14 '23 22:07 kdt3rd

I backported #1488 on top of 3.1.9 and it fixes most architectures (armv7, ppc64le, etc.) but aarch64 still fails:

[  112s]  58/113 Test  #50: OpenEXRCore.testDWAACompression ...........Subprocess aborted***Exception:   2.34 sec
[  112s] tempDir = '/var/tmp/OpenEXR_0hzGw8/': 24
[  112s] 
[  112s] =======
[  112s] Running testDWAACompression
[  112s]   zeroes tiled: no sampling 1, 1 comp 8
[  112s] File sizes do not match: '/var/tmp/OpenEXR_0hzGw8/zeroes_imf_test_comp.exr' 3602 '/var/tmp/OpenEXR_0hzGw8/zeroes_imf_test_comp_cpp.exr' 3707
[  112s]   zeroes tiled: yes sampling 1, 1 comp 8
[  112s] Files '/var/tmp/OpenEXR_0hzGw8/zeroes_imf_test_comp.exr' and '/var/tmp/OpenEXR_0hzGw8/zeroes_imf_test_comp_cpp.exr' differ in chunk starting at 411
[  112s]   zeroes tiled: no sampling 1, 2 comp 8
[  112s] File sizes do not match: '/var/tmp/OpenEXR_0hzGw8/zeroes_imf_test_comp.exr' 4592 '/var/tmp/OpenEXR_0hzGw8/zeroes_imf_test_comp_cpp.exr' 4692
[  112s]   zeroes tiled: no sampling 2, 1 comp 8
[  112s] File sizes do not match: '/var/tmp/OpenEXR_0hzGw8/zeroes_imf_test_comp.exr' 3602 '/var/tmp/OpenEXR_0hzGw8/zeroes_imf_test_comp_cpp.exr' 3707
[  112s]   zeroes tiled: no sampling 2, 2 comp 8
[  112s] File sizes do not match: '/var/tmp/OpenEXR_0hzGw8/zeroes_imf_test_comp.exr' 4592 '/var/tmp/OpenEXR_0hzGw8/zeroes_imf_test_comp_cpp.exr' 4692
[  112s]   pattern1 tiled: no sampling 1, 1 comp 8
[  112s] File sizes do not match: '/var/tmp/OpenEXR_0hzGw8/pattern1_imf_test_comp.exr' 7640 '/var/tmp/OpenEXR_0hzGw8/pattern1_imf_test_comp_cpp.exr' 70350
[  112s]   pattern1 tiled: yes sampling 1, 1 comp 8
[  112s] File sizes do not match: '/var/tmp/OpenEXR_0hzGw8/pattern1_imf_test_comp.exr' 88311 '/var/tmp/OpenEXR_0hzGw8/pattern1_imf_test_comp_cpp.exr' 88096
[  112s]   pattern1 tiled: no sampling 1, 2 comp 8
[  112s] File sizes do not match: '/var/tmp/OpenEXR_0hzGw8/pattern1_imf_test_comp.exr' 9690 '/var/tmp/OpenEXR_0hzGw8/pattern1_imf_test_comp_cpp.exr' 73368
[  112s]   pattern1 tiled: no sampling 2, 1 comp 8
[  112s] File sizes do not match: '/var/tmp/OpenEXR_0hzGw8/pattern1_imf_test_comp.exr' 7640 '/var/tmp/OpenEXR_0hzGw8/pattern1_imf_test_comp_cpp.exr' 70350
[  112s]   pattern1 tiled: no sampling 2, 2 comp 8
[  112s] File sizes do not match: '/var/tmp/OpenEXR_0hzGw8/pattern1_imf_test_comp.exr' 9690 '/var/tmp/OpenEXR_0hzGw8/pattern1_imf_test_comp_cpp.exr' 73368
[  112s]   pattern2 tiled: no sampling 1, 1 comp 8
[  112s] File sizes do not match: '/var/tmp/OpenEXR_0hzGw8/pattern2_imf_test_comp.exr' 1890661 '/var/tmp/OpenEXR_0hzGw8/pattern2_imf_test_comp_cpp.exr' 1745228
[  112s] R half at 1353, 8 not equal: C++ loaded C 0x2e2f (0.0966187) vs C loaded C 0x2e2d (0.0964966)
[  112s] Core Test failed: a == b
[  112s]            file:/home/abuild/rpmbuild/BUILD/openexr-3.1.9/src/test/OpenEXRCoreTest/compression.cpp
[  112s]            line:475
[  112s]        function:static void pixels::compareExact(uint16_t, uint16_t, int, int, const char*, const char*, const char*)
[  112s] 
[  112s]         Start  62: OpenEXR.testConversion
[  114s]  59/113 Test  #51: OpenEXRCore.testDWABCompression ...........Subprocess aborted***Exception:   2.85 sec
[  114s] tempDir = '/var/tmp/OpenEXR_Hi3fZx/': 24
[  114s] 
[  114s] =======
[  114s] Running testDWABCompression
[  114s]   zeroes tiled: no sampling 1, 1 comp 9
[  114s] File sizes do not match: '/var/tmp/OpenEXR_Hi3fZx/zeroes_imf_test_comp.exr' 2791 '/var/tmp/OpenEXR_Hi3fZx/zeroes_imf_test_comp_cpp.exr' 2861
[  114s]   zeroes tiled: yes sampling 1, 1 comp 9
[  114s] Files '/var/tmp/OpenEXR_Hi3fZx/zeroes_imf_test_comp.exr' and '/var/tmp/OpenEXR_Hi3fZx/zeroes_imf_test_comp_cpp.exr' differ in chunk starting at 411
[  114s]   zeroes tiled: no sampling 1, 2 comp 9
[  114s] File sizes do not match: '/var/tmp/OpenEXR_Hi3fZx/zeroes_imf_test_comp.exr' 3001 '/var/tmp/OpenEXR_Hi3fZx/zeroes_imf_test_comp_cpp.exr' 3071
[  114s]   zeroes tiled: no sampling 2, 1 comp 9
[  114s] File sizes do not match: '/var/tmp/OpenEXR_Hi3fZx/zeroes_imf_test_comp.exr' 2791 '/var/tmp/OpenEXR_Hi3fZx/zeroes_imf_test_comp_cpp.exr' 2861
[  114s]   zeroes tiled: no sampling 2, 2 comp 9
[  114s] File sizes do not match: '/var/tmp/OpenEXR_Hi3fZx/zeroes_imf_test_comp.exr' 3001 '/var/tmp/OpenEXR_Hi3fZx/zeroes_imf_test_comp_cpp.exr' 3071
[  114s]   pattern1 tiled: no sampling 1, 1 comp 9
[  114s] File sizes do not match: '/var/tmp/OpenEXR_Hi3fZx/pattern1_imf_test_comp.exr' 68817 '/var/tmp/OpenEXR_Hi3fZx/pattern1_imf_test_comp_cpp.exr' 68889
[  114s]   pattern1 tiled: yes sampling 1, 1 comp 9
[  114s] File sizes do not match: '/var/tmp/OpenEXR_Hi3fZx/pattern1_imf_test_comp.exr' 234006 '/var/tmp/OpenEXR_Hi3fZx/pattern1_imf_test_comp_cpp.exr' 233791
[  114s]   pattern1 tiled: no sampling 1, 2 comp 9
[  114s] File sizes do not match: '/var/tmp/OpenEXR_Hi3fZx/pattern1_imf_test_comp.exr' 67973 '/var/tmp/OpenEXR_Hi3fZx/pattern1_imf_test_comp_cpp.exr' 68043
[  114s]   pattern1 tiled: no sampling 2, 1 comp 9
[  114s] File sizes do not match: '/var/tmp/OpenEXR_Hi3fZx/pattern1_imf_test_comp.exr' 68817 '/var/tmp/OpenEXR_Hi3fZx/pattern1_imf_test_comp_cpp.exr' 68889
[  114s]   pattern1 tiled: no sampling 2, 2 comp 9
[  114s] File sizes do not match: '/var/tmp/OpenEXR_Hi3fZx/pattern1_imf_test_comp.exr' 67973 '/var/tmp/OpenEXR_Hi3fZx/pattern1_imf_test_comp_cpp.exr' 68043
[  114s]   pattern2 tiled: no sampling 1, 1 comp 9
[  114s] File sizes do not match: '/var/tmp/OpenEXR_Hi3fZx/pattern2_imf_test_comp.exr' 1707650 '/var/tmp/OpenEXR_Hi3fZx/pattern2_imf_test_comp_cpp.exr' 1707912
[  114s] R half at 1353, 8 not equal: C++ loaded C 0x2e2f (0.0966187) vs C loaded C 0x2e2d (0.0964966)
[  114s] Core Test failed: a == b
[  114s]            file:/home/abuild/rpmbuild/BUILD/openexr-3.1.9/src/test/OpenEXRCoreTest/compression.cpp
[  114s]            line:475
[  114s]        function:static void pixels::compareExact(uint16_t, uint16_t, int, int, const char*, const char*, const char*)

Jul 17 '23 14:07 ggardet

Closed inadvertently. #1488 fixes part of the problem, but there is still a failure on aarch64, right?

Jul 23 '23 22:07 cary-ilm

> Closed inadvertently. #1488 fixes part of the problem, but there is still a failure on aarch64, right?

That's correct. aarch64 still has issues.

Jul 24 '23 06:07 ggardet