2.1.2 seems to fail tests on ppc64le with musl libc

Open nekopsykose opened this issue 1 year ago • 53 comments

 1/67 Test  #1: example ..........................***Failed    0.19 sec
zlib-ng version 2.1.2 = 0x020102f0, compile flags = 0xa9
uncompress error: -3

      Start 65: makecrct
 2/67 Test  #2: infcover .........................   Passed    0.19 sec
      Start 66: makefixed
 3/67 Test  #3: CVE-2002-0059 ....................   Passed    0.19 sec
      Start 67: maketrees
 4/67 Test  #4: CVE-2004-0797 ....................   Passed    0.19 sec
 5/67 Test  #5: CVE-2005-1849 ....................   Passed    0.19 sec
 6/67 Test  #6: CVE-2005-2096 ....................   Passed    0.19 sec
 7/67 Test  #7: CVE-2018-25032-fixed-level-6 .....   Passed    0.19 sec
 8/67 Test  #8: CVE-2018-25032-default-level-6 ...***Failed    0.19 sec
-- Compress /home/buildozer/aports/community/zlib-ng/src/zlib-ng-2.1.2/build/minideflate;-c;-k;-m;1;-w;-15;-s;4;-6
--   Source file: /home/buildozer/aports/community/zlib-ng/src/zlib-ng-2.1.2/test/CVE-2018-25032/default.txt
--   Compression input file: /home/buildozer/aports/community/zlib-ng/src/zlib-ng-2.1.2/build/test/Testing/Temporary/default-txt-Czzxhg
--   Output: /home/buildozer/aports/community/zlib-ng/src/zlib-ng-2.1.2/build/test/Testing/Temporary/default-txt-Czzxhg.gz
-- Decompress /home/buildozer/aports/community/zlib-ng/src/zlib-ng-2.1.2/build/minideflate;-c;-k;-d;-m;1;-w;-15;-6
--   Input: /home/buildozer/aports/community/zlib-ng/src/zlib-ng-2.1.2/build/test/Testing/Temporary/default-txt-Czzxhg.gz
--   Output: /home/buildozer/aports/community/zlib-ng/src/zlib-ng-2.1.2/build/test/Testing/Temporary/default-txt-Czzxhg
-- Diff comparison
--   Input: /home/buildozer/aports/community/zlib-ng/src/zlib-ng-2.1.2/test/CVE-2018-25032/default.txt
--   Output: /home/buildozer/aports/community/zlib-ng/src/zlib-ng-2.1.2/build/test/Testing/Temporary/default-txt-Czzxhg
-- Diff:
--- /home/buildozer/aports/community/zlib-ng/src/zlib-ng-2.1.2/test/CVE-2018-25032/default.txt.hex
+++ /home/buildozer/aports/community/zlib-ng/src/zlib-ng-2.1.2/build/test/Testing/Temporary/default-txt-Czzxhg.hex
@@ -2302,7 +2302,7 @@
 00008fd0: 554f 434f 4f41 534d 484a 5844 5343 5943  UOCOOASMHJXDSCYC
 00008fe0: 4258 555a 4752 5450 4b4a 4a43 5343 434b  BXUZGRTPKJJCSCCK
 00008ff0: 4559 4b44 4f59 4f52 4a4d 444f 534e 5754  EYKDOYORJMDOSNWT
-00009000: 4a56 5755 4a4b 5359 5244 434d 4c56 4f47  JVWUJKSYRDCMLVOG
+00009000: 4a56 5755 4a4b 5359 4244 434d 4c56 4f47  JVWUJKSYBDCMLVOG
 00009010: 5047 4d59 4a45 5246 5142 4759 4e5a 544c  PGMYJERFQBGYNZTL
 00009020: 4159 5346 4e51 534f 514b 5344 4844 5541  AYSFNQSOQKSDHDUA
 00009030: 5942 494f 4756 4c52 4e43 5955 4651 4f42  YBIOGVLRNCYUFQOB
@@ -2310,7 +2310,7 @@
 00009050: 4f43 4f4f 4153 4d48 4a58 4453 4359 4342  OCOOASMHJXDSCYCB
 00009060: 5855 5a47 5254 504b 4a4a 4353 4343 4b45  XUZGRTPKJJCSCCKE
 00009070: 594b 444f 594f 524a 4d44 4f53 4e57 544a  YKDOYORJMDOSNWTJ
-00009080: 5657 554a 4b53 5942 5952 4443 4d4c 564f  VWUJKSYBYRDCMLVO
+00009080: 5657 554a 4b53 5942 5942 4443 4d4c 564f  VWUJKSYBYBDCMLVO
 00009090: 4750 474d 594a 4552 4651 4247 594e 5a54  GPGMYJERFQBGYNZT
 000090a0: 4c41 5953 464e 5153 4f51 4b53 4448 4455  LAYSFNQSOQKSDHDU
 000090b0: 4159 4249 4f47 564c 524e 4359 5546 514f  AYBIOGVLRNCYUFQO
@@ -2420,7 +2420,7 @@
 00009730: 4b4a 4a43 5343 434b 4559 4b44 4f59 4f52  KJJCSCCKEYKDOYOR
 00009740: 4a4d 444f 534e 5754 4a56 5755 4a4b 5359  JMDOSNWTJVWUJKSY
 00009750: 4251 574c 494d 4e56 5549 5944 4848 4543  BQWLIMNVUIYDHHEC
-00009760: 5047 4b56 4650 5246 5951 4952 544d 4f59  PGKVFPRFYQIRTMOY
+00009760: 5047 4b5a 4650 5246 5951 4952 544d 4f59  PGKZFPRFYQIRTMOY
 00009770: 5244 434d 4c56 4f47 5047 4d59 4a45 5246  RDCMLVOGPGMYJERF
 00009780: 5142 4759 4e5a 544c 4159 5346 4e51 534f  QBGYNZTLAYSFNQSO
 00009790: 514b 5344 4844 5541 5942 494f 4756 4c52  QKSDHDUAYBIOGVLR
@@ -2712,7 +2712,7 @@
 0000a970: 4841 4c43 515a 4149 5752 5044 4159 474e  HALCQZAIWRPDAYGN
 0000a980: 5a44 4b4d 5449 4645 4a48 5154 454d 4959  ZDKMTIFEJHQTEMIY
 0000a990: 5641 525a 4a52 5755 4444 584b 4d47 5555  VARZJRWUDDXKMGUU
-0000a9a0: 5142 5748 5a4e 5743 4a55 4742 4d57 5552  QBWHZNWCJUGBMWUR
+0000a9a0: 5142 5748 5a4e 5743 4a46 4742 4d57 5552  QBWHZNWCJFGBMWUR
 0000a9b0: 4344 4b48 4645 4542 4c52 5644 5546 4f41  CDKHFEEBLRVDUFOA
 0000a9c0: 5241 4f46 4b58 4645 5359 4e59 524b 534d  RAOFKXFESYNYRKSM
 0000a9d0: 4c42 5551 5954 434a 534c 4849 4847 4d50  LBUQYTCJSLHIHGMP
@@ -2726,7 +2726,7 @@
 0000aa50: 515a 4149 5752 5044 4159 474e 5a44 4b4d  QZAIWRPDAYGNZDKM
 0000aa60: 5449 4645 4a48 5154 454d 4959 5641 525a  TIFEJHQTEMIYVARZ
 0000aa70: 4a52 5755 4444 584b 4d47 5555 5142 5748  JRWUDDXKMGUUQBWH
-0000aa80: 5a4e 5743 4a46 4351 5454 4855 4742 4d57  ZNWCJFCQTTHUGBMW
+0000aa80: 5a4e 5743 4a46 4351 5454 4846 4742 4d57  ZNWCJFCQTTHFGBMW
 0000aa90: 5552 4344 4b48 4645 4542 4c52 5644 5546  URCDKHFEEBLRVDUF
 0000aaa0: 4f41 5241 4f46 4b58 4645 5359 4e59 524b  OARAOFKXFESYNYRK
 0000aab0: 534d 4c42 5551 5954 434a 534c 4849 4847  SMLBUQYTCJSLHIHG
@@ -2811,7 +2811,7 @@
 0000afa0: 4449 4841 4c43 515a 4149 5752 5044 4159  DIHALCQZAIWRPDAY
 0000afb0: 474e 5a44 4b4d 5449 4645 4a48 5154 454d  GNZDKMTIFEJHQTEM
 0000afc0: 4959 5641 525a 4a52 5755 4444 584b 4d47  IYVARZJRWUDDXKMG
-0000afd0: 5555 5142 5748 5a4e 5743 4a46 4351 5455  UUQBWHZNWCJFCQTU
+0000afd0: 5555 5142 5748 5a4e 5743 4a46 4351 5454  UUQBWHZNWCJFCQTT
 0000afe0: 545a 4141 5848 5547 424d 5755 5243 444b  TZAAXHUGBMWURCDK
 0000aff0: 4846 4545 424c 5256 4455 464f 4152 414f  HFEEBLRVDUFOARAO
 0000b000: 464b 5846 4553 594e 5952 4b53 4d4c 4255  FKXFESYNYRKSMLBU
@@ -2994,7 +2994,7 @@
 0000bb10: 554e 5951 4b45 4149 434d 464f 5959 4c44  UNYQKEAICMFOYYLD
 0000bb20: 524a 5158 434b 5155 494c 4249 5452 4843  RJQXCKQUILBITRHC
 0000bb30: 5553 514d 434e 5344 4443 4c4d 5952 5149  USQMCNSDDCLMYRQI
-0000bb40: 4944 4155 4b4e 5857 4659 5250 484b 4450  IDAUKNXWFYRPHKDP
+0000bb40: 4144 4155 4b4e 5857 4659 5250 484b 4450  ADAUKNXWFYRPHKDP
 0000bb50: 5748 4143 5147 5146 4c43 4454 5858 5a4b  WHACQGQFLCDTXXZK
 0000bb60: 5944 4744 4450 4f42 5357 454d 5041 4156  YDGDDPOBSWEMPAAV
 0000bb70: 4643 4558 514a 4443 424f 5a49 5253 4f57  FCEXQJDCBOZIRSOW

CMake Error at /home/buildozer/aports/community/zlib-ng/src/zlib-ng-2.1.2/test/cmake/compress-and-verify.cmake:195 (message):
  Compare decompress failed: 1


(...)

nekopsykose avatar Jun 09 '23 03:06 nekopsykose

It looks like one bit gets flipped... We test all supported non-Windows platforms with GNU libc only, so compiling against a musl toolchain may be revealing some broken functions.
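
One way to check the "one bit" reading is to XOR each changed byte from the hex diff above and count the differing bits. This is a minimal Python sketch. The byte pairs are copied by hand from the seven diff hunks (expected byte first, then the byte the round trip produced), and `differing_bits` is a helper name made up for illustration.

```python
# (expected, actual) byte pairs read off the seven hunks of the hex diff.
CHANGED_BYTES = [
    (0x52, 0x42),  # line 00009000: 'R' became 'B'
    (0x52, 0x42),  # line 00009080: 'R' became 'B'
    (0x56, 0x5A),  # line 00009760: 'V' became 'Z'
    (0x55, 0x46),  # line 0000a9a0: 'U' became 'F'
    (0x55, 0x46),  # line 0000aa80: 'U' became 'F'
    (0x55, 0x54),  # line 0000afd0: 'U' became 'T'
    (0x49, 0x41),  # line 0000bb40: 'I' became 'A'
]


def differing_bits(expected: int, actual: int) -> int:
    """Number of bit positions in which the two bytes differ."""
    return bin(expected ^ actual).count("1")


for expected, actual in CHANGED_BYTES:
    xor = expected ^ actual
    print(f"{chr(expected)} -> {chr(actual)}: xor=0x{xor:02x}, "
          f"bits={differing_bits(expected, actual)}")
```

By this count, four of the seven changed bytes differ in a single bit and the other three differ in two or three bits, so the damage may be wrong bytes being produced rather than isolated bit flips.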

mtl1979 avatar Jun 09 '23 08:06 mtl1979

@nekopsykose Did you test this without using musl as well? Just want to make sure musl is what is causing this.

Dead2 avatar Jun 09 '23 10:06 Dead2

i don't have access to a ppc64le machine without musl on it (or with a shell on it either, this is from a pipeline sadly). i could probably scrape together something but it would most likely be easier for you to compare and verify

nekopsykose avatar Jun 09 '23 10:06 nekopsykose

Ok, so probably musl, but it might be something else as well, thanks. It might also be a musl-on-ppc64le specific bug, in case someone tests with musl on x86* and it works fine.

I don't have access to any ppc* machines personally unfortunately. @mtl1979 did you have something like that?

Dead2 avatar Jun 09 '23 10:06 Dead2

it passes on every other architecture, yeah

nekopsykose avatar Jun 09 '23 10:06 nekopsykose

I remember ppc64le and sparc64 having issues even with glibc (GNU libc), but that was with specific gcc versions. As much as I want to blame musl, it might just be that the musl toolchain is based on a broken commit of the gcc toolchain. I tried the same gcc version on two different Ubuntu versions and it failed only on the newer one.

mtl1979 avatar Jun 09 '23 12:06 mtl1979

ppc64le is a bit special in that VSX needs to perform a flip on loads, because memory loads and stores of certain sizes are natively big endian. Most of the time it doesn't matter, but maybe we're looking at one of those times where it does? If you disable VSX and VMX, do the tests pass? That will at least tell us whether it's an issue with vectorization or not.

KungFuJesus avatar Jun 09 '23 13:06 KungFuJesus

On develop:

  • On glibc ppc64 (big endian), they all pass.
  • On glibc ppc64le, I get many failures as well as one test which hangs:
Running tests...
Test project /root/zlib-ng/build
      Start  1: example
 1/68 Test  #1: example ..........................***Failed    0.00 sec
      Start  2: infcover
 2/68 Test  #2: infcover .........................   Passed    0.00 sec
      Start  3: CVE-2002-0059
 3/68 Test  #3: CVE-2002-0059 ....................   Passed    0.01 sec
      Start  4: CVE-2004-0797
 4/68 Test  #4: CVE-2004-0797 ....................   Passed    0.01 sec
      Start  5: CVE-2005-1849
 5/68 Test  #5: CVE-2005-1849 ....................   Passed    0.01 sec
      Start  6: CVE-2005-2096
 6/68 Test  #6: CVE-2005-2096 ....................   Passed    0.01 sec
      Start  7: CVE-2018-25032-fixed-level-6
 7/68 Test  #7: CVE-2018-25032-fixed-level-6 .....   Passed    0.04 sec
      Start  8: CVE-2018-25032-default-level-6
 8/68 Test  #8: CVE-2018-25032-default-level-6 ...***Failed    0.04 sec
      Start  9: CVE-2018-25032-fixed-level-1
 9/68 Test  #9: CVE-2018-25032-fixed-level-1 .....   Passed    0.04 sec
      Start 10: CVE-2018-25032-default-level-1
10/68 Test #10: CVE-2018-25032-default-level-1 ...***Failed    0.04 sec
      Start 11: CVE-2018-25032-fixed-level-2
11/68 Test #11: CVE-2018-25032-fixed-level-2 .....   Passed    0.04 sec
      Start 12: CVE-2018-25032-default-level-2
12/68 Test #12: CVE-2018-25032-default-level-2 ...***Failed    0.04 sec
      Start 13: minigzip-fireworks.jpg-R
13/68 Test #13: minigzip-fireworks.jpg-R .........   Passed    0.10 sec
      Start 14: minigzip-fireworks.jpg-h
14/68 Test #14: minigzip-fireworks.jpg-h .........   Passed    0.10 sec
      Start 15: minigzip-fireworks.jpg-T
15/68 Test #15: minigzip-fireworks.jpg-T .........   Passed    0.04 sec
      Start 16: minigzip-fireworks.jpg-0
16/68 Test #16: minigzip-fireworks.jpg-0 .........   Passed    0.11 sec
      Start 17: minigzip-fireworks.jpg-1
17/68 Test #17: minigzip-fireworks.jpg-1 .........***Failed    0.04 sec
      Start 18: minigzip-fireworks.jpg-2
18/68 Test #18: minigzip-fireworks.jpg-2 .........***Failed    0.04 sec
      Start 19: minigzip-fireworks.jpg-4
19/68 Test #19: minigzip-fireworks.jpg-4 .........***Failed    0.04 sec
      Start 20: minigzip-fireworks.jpg-5
20/68 Test #20: minigzip-fireworks.jpg-5 .........***Failed    0.04 sec
      Start 21: minigzip-fireworks.jpg-F
21/68 Test #21: minigzip-fireworks.jpg-F .........   Passed    0.10 sec
      Start 22: minigzip-fireworks.jpg-6
22/68 Test #22: minigzip-fireworks.jpg-6 .........***Failed    0.04 sec
      Start 23: minigzip-fireworks.jpg-9
23/68 Test #23: minigzip-fireworks.jpg-9 .........***Failed    0.04 sec
      Start 24: minigzip-fireworks.jpg-f
24/68 Test #24: minigzip-fireworks.jpg-f .........***Failed    0.04 sec
      Start 25: minigzip-lcet10.txt-R
25/68 Test #25: minigzip-lcet10.txt-R ............   Passed    0.15 sec
      Start 26: minigzip-lcet10.txt-h
26/68 Test #26: minigzip-lcet10.txt-h ............   Passed    0.14 sec
      Start 27: minigzip-lcet10.txt-T
27/68 Test #27: minigzip-lcet10.txt-T ............   Passed    0.05 sec
      Start 28: minigzip-lcet10.txt-0
28/68 Test #28: minigzip-lcet10.txt-0 ............   Passed    0.14 sec
      Start 29: minigzip-lcet10.txt-1
29/68 Test #29: minigzip-lcet10.txt-1 ............***Failed    0.05 sec
      Start 30: minigzip-lcet10.txt-2
30/68 Test #30: minigzip-lcet10.txt-2 ............***Failed    0.06 sec
      Start 31: minigzip-lcet10.txt-4
31/68 Test #31: minigzip-lcet10.txt-4 ............***Failed    0.06 sec
      Start 32: minigzip-lcet10.txt-5
32/68 Test #32: minigzip-lcet10.txt-5 ............***Failed    0.08 sec
      Start 33: minigzip-lcet10.txt-F
33/68 Test #33: minigzip-lcet10.txt-F ............***Failed    0.09 sec
      Start 34: minigzip-lcet10.txt-6
34/68 Test #34: minigzip-lcet10.txt-6 ............***Failed    0.09 sec
      Start 35: minigzip-lcet10.txt-9
35/68 Test #35: minigzip-lcet10.txt-9 ............***Failed    0.14 sec
      Start 36: minigzip-lcet10.txt-f
36/68 Test #36: minigzip-lcet10.txt-f ............***Failed    0.09 sec
      Start 37: minigzip-paper-100k.pdf-R
37/68 Test #37: minigzip-paper-100k.pdf-R ........   Passed    0.10 sec
      Start 38: minigzip-paper-100k.pdf-h
38/68 Test #38: minigzip-paper-100k.pdf-h ........   Passed    0.10 sec
      Start 39: minigzip-paper-100k.pdf-T
39/68 Test #39: minigzip-paper-100k.pdf-T ........   Passed    0.04 sec
      Start 40: minigzip-paper-100k.pdf-0
40/68 Test #40: minigzip-paper-100k.pdf-0 ........   Passed    0.09 sec
      Start 41: minigzip-paper-100k.pdf-1
41/68 Test #41: minigzip-paper-100k.pdf-1 ........***Failed    0.04 sec
      Start 42: minigzip-paper-100k.pdf-2
42/68 Test #42: minigzip-paper-100k.pdf-2 ........***Failed    0.03 sec
      Start 43: minigzip-paper-100k.pdf-4
43/68 Test #43: minigzip-paper-100k.pdf-4 ........***Failed    0.04 sec
      Start 44: minigzip-paper-100k.pdf-5
44/68 Test #44: minigzip-paper-100k.pdf-5 ........***Failed    0.04 sec
      Start 45: minigzip-paper-100k.pdf-F
45/68 Test #45: minigzip-paper-100k.pdf-F ........***Failed    0.04 sec
      Start 46: minigzip-paper-100k.pdf-6
46/68 Test #46: minigzip-paper-100k.pdf-6 ........***Failed    0.04 sec
      Start 47: minigzip-paper-100k.pdf-9
47/68 Test #47: minigzip-paper-100k.pdf-9 ........***Failed    0.04 sec
      Start 48: minigzip-paper-100k.pdf-f
48/68 Test #48: minigzip-paper-100k.pdf-f ........***Failed    0.04 sec
      Start 49: minigzip-detect-text-A
49/68 Test #49: minigzip-detect-text-A ...........***Failed    0.09 sec
      Start 50: minigzip-detect-binary-A
50/68 Test #50: minigzip-detect-binary-A .........***Failed    0.04 sec
      Start 51: GH-361
51/68 Test #51: GH-361 ...........................   Passed    0.09 sec
      Start 52: GH-364
52/68 Test #52: GH-364 ...........................   Passed    0.08 sec
      Start 53: GH-382
53/68 Test #53: GH-382 ...........................   Passed    0.04 sec
      Start 54: GH-536-segfault
54/68 Test #54: GH-536-segfault ..................***Failed    0.04 sec
      Start 55: GH-536-incomplete-read
55/68 Test #55: GH-536-incomplete-read ...........***Failed    0.05 sec
      Start 56: GH-536-zero-stored-block
56/68 Test #56: GH-536-zero-stored-block .........***Failed    0.04 sec
      Start 57: GH-751
57/68 Test #57: GH-751 ...........................***Failed    0.03 sec
      Start 58: minigzip-file_compress
58/68 Test #58: minigzip-file_compress ...........***Failed    0.02 sec
      Start 59: minideflate-file_compress
59/68 Test #59: minideflate-file_compress ........***Failed    0.02 sec
      Start 60: minigzip-help
60/68 Test #60: minigzip-help ....................   Passed    0.01 sec
      Start 61: minigzip-invalid
61/68 Test #61: minigzip-invalid .................   Passed    0.01 sec
      Start 62: minideflate-help
62/68 Test #62: minideflate-help .................   Passed    0.01 sec
      Start 63: minideflate-invalid
63/68 Test #63: minideflate-invalid ..............   Passed    0.01 sec
      Start 64: switchlevels-help
64/68 Test #64: switchlevels-help ................   Passed    0.01 sec
      Start 65: makecrct
65/68 Test #65: makecrct .........................   Passed    0.10 sec
      Start 66: makefixed
66/68 Test #66: makefixed ........................   Passed    0.03 sec
      Start 67: maketrees
67/68 Test #67: maketrees ........................   Passed    0.03 sec
      Start 68: gtest_zlib
  • On glibc ppc64le, if I set -DWITH_ALTIVEC=OFF -DWITH_POWER8=OFF -DWITH_POWER9=OFF, they all pass.
  • On glibc ppc64le, if I set -DWITH_ALTIVEC=ON -DWITH_POWER8=OFF -DWITH_POWER9=OFF, 3 fail:

96% tests passed, 3 tests failed out of 68

Total Test time (real) =   6.45 sec

The following tests FAILED:
          1 - example (Failed)
         59 - minideflate-file_compress (Failed)
         68 - gtest_zlib (Failed)

----

[  FAILED  ] 60 tests, listed below:
[  FAILED  ] deflate.params
[  FAILED  ] inflate.adler32
[  FAILED  ] adler32/adler32_variant.vmx/28, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 A0-6B AB-36 01-00 00-00 14-00 00-00 5F-06 5D-42>
[  FAILED  ] adler32/adler32_variant.vmx/30, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 D0-6B AB-36 01-00 00-00 14-00 00-00 50-06 29-42>
[  FAILED  ] adler32/adler32_variant.vmx/32, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 00-6C AB-36 01-00 00-00 14-00 00-00 09-06 77-3F>
[  FAILED  ] adler32/adler32_variant.vmx/34, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 30-6C AB-36 01-00 00-00 14-00 00-00 B7-06 AC-48>
[  FAILED  ] adler32/adler32_variant.vmx/36, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 60-6C AB-36 01-00 00-00 14-00 00-00 E6-06 A9-44>
[  FAILED  ] adler32/adler32_variant.vmx/38, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 90-6C AB-36 01-00 00-00 14-00 00-00 F9-06 77-4A>
[  FAILED  ] adler32/adler32_variant.vmx/40, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 C0-6C AB-36 01-00 00-00 14-00 00-00 E5-06 AE-48>
[  FAILED  ] adler32/adler32_variant.vmx/42, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 F0-6C AB-36 01-00 00-00 14-00 00-00 14-04 10-2B>
[  FAILED  ] adler32/adler32_variant.vmx/44, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 20-6D AB-36 01-00 00-00 14-00 00-00 23-04 45-2B>
[  FAILED  ] adler32/adler32_variant.vmx/46, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 50-6D AB-36 01-00 00-00 14-00 00-00 2B-04 C1-2B>
[  FAILED  ] adler32/adler32_variant.vmx/48, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 80-6D AB-36 01-00 00-00 14-00 00-00 1A-04 68-2B>
[  FAILED  ] adler32/adler32_variant.vmx/50, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 B0-6D AB-36 01-00 00-00 14-00 00-00 17-04 FA-2A>
[  FAILED  ] adler32/adler32_variant.vmx/52, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 E0-6D AB-36 01-00 00-00 14-00 00-00 20-04 8D-2B>
[  FAILED  ] adler32/adler32_variant.vmx/54, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 10-6E AB-36 01-00 00-00 14-00 00-00 3F-04 8E-2C>
[  FAILED  ] adler32/adler32_variant.vmx/56, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 40-6E AB-36 01-00 00-00 14-00 00-00 17-04 F1-2A>
[  FAILED  ] adler32/adler32_variant.vmx/57, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 58-6E AB-36 01-00 00-00 1E-00 00-00 41-08 9D-7C>
[  FAILED  ] adler32/adler32_variant.vmx/58, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 78-6E AB-36 01-00 00-00 1E-00 00-00 51-07 06-71>
[  FAILED  ] adler32/adler32_variant.vmx/59, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 98-6E AB-36 01-00 00-00 1E-00 00-00 0A-07 95-70>
[  FAILED  ] adler32/adler32_variant.vmx/60, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 B8-6E AB-36 01-00 00-00 1E-00 00-00 15-08 53-82>
[  FAILED  ] adler32/adler32_variant.vmx/61, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 D8-6E AB-36 01-00 00-00 1E-00 00-00 61-06 25-61>
[  FAILED  ] adler32/adler32_variant.vmx/62, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 F8-6E AB-36 01-00 00-00 1E-00 00-00 A3-06 20-64>
[  FAILED  ] adler32/adler32_variant.vmx/63, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 18-6F AB-36 01-00 00-00 1E-00 00-00 CB-06 42-67>
[  FAILED  ] adler32/adler32_variant.vmx/64, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 38-6F AB-36 01-00 00-00 1E-00 00-00 80-06 67-67>
[  FAILED  ] adler32/adler32_variant.vmx/65, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 58-6F AB-36 01-00 00-00 1E-00 00-00 0F-07 47-75>
[  FAILED  ] adler32/adler32_variant.vmx/66, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 78-6F AB-36 01-00 00-00 1E-00 00-00 EE-06 EA-69>
[  FAILED  ] adler32/adler32_variant.vmx/67, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 98-6F AB-36 01-00 00-00 64-00 00-00 92-1E B0-01>
[  FAILED  ] adler32/adler32_variant.vmx/68, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 00-70 AB-36 01-00 00-00 64-00 00-00 96-1E DB-FB>
[  FAILED  ] adler32/adler32_variant.vmx/69, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 68-70 AB-36 01-00 00-00 64-00 00-00 C8-1E A6-47>
[  FAILED  ] adler32/adler32_variant.vmx/70, where GetParam() = 24-byte object <01-00 00-00 00-00 00-00 80-75 AB-36 01-00 00-00 B0-15 00-00 8F-71 81-8B>
[  FAILED  ] adler32/adler32_variant.vmx/99, where GetParam() = 24-byte object <6A-AE 8F-50 00-00 00-00 A0-6B AB-36 01-00 00-00 14-00 00-00 C8-B4 F2-33>
[  FAILED  ] adler32/adler32_variant.vmx/101, where GetParam() = 24-byte object <40-6A 13-67 00-00 00-00 D0-6B AB-36 01-00 00-00 14-00 00-00 8F-70 A0-F6>
[  FAILED  ] adler32/adler32_variant.vmx/103, where GetParam() = 24-byte object <B5-84 0C-2E 00-00 00-00 00-6C AB-36 01-00 00-00 14-00 00-00 BD-8A 29-CC>
[  FAILED  ] adler32/adler32_variant.vmx/105, where GetParam() = 24-byte object <92-AA 53-F8 00-00 00-00 30-6C AB-36 01-00 00-00 14-00 00-00 48-B1 25-95>
[  FAILED  ] adler32/adler32_variant.vmx/107, where GetParam() = 24-byte object <9F-5B 27-03 00-00 00-00 60-6C AB-36 01-00 00-00 14-00 00-00 84-62 91-70>
[  FAILED  ] adler32/adler32_variant.vmx/109, where GetParam() = 24-byte object <62-BF C8-AF 00-00 00-00 90-6C AB-36 01-00 00-00 14-00 00-00 5A-C6 B4-EE>
[  FAILED  ] adler32/adler32_variant.vmx/111, where GetParam() = 24-byte object <14-B2 75-0E 00-00 00-00 C0-6C AB-36 01-00 00-00 14-00 00-00 F8-B8 71-41>
[  FAILED  ] adler32/adler32_variant.vmx/113, where GetParam() = 24-byte object <B1-A4 57-F8 00-00 00-00 F0-6C AB-36 01-00 00-00 14-00 00-00 C4-A8 F9-01>
[  FAILED  ] adler32/adler32_variant.vmx/115, where GetParam() = 24-byte object <16-56 AA-D6 00-00 00-00 20-6D AB-36 01-00 00-00 14-00 00-00 38-5A FC-BB>
[  FAILED  ] adler32/adler32_variant.vmx/117, where GetParam() = 24-byte object <85-23 E9-0B 00-00 00-00 50-6D AB-36 01-00 00-00 14-00 00-00 AF-27 18-FE>
[  FAILED  ] adler32/adler32_variant.vmx/119, where GetParam() = 24-byte object <C1-3B B1-3D 00-00 00-00 80-6D AB-36 01-00 00-00 14-00 00-00 DA-3F 64-14>
[  FAILED  ] adler32/adler32_variant.vmx/121, where GetParam() = 24-byte object <36-94 CD-F6 00-00 00-00 B0-6D AB-36 01-00 00-00 14-00 00-00 4C-98 9F-B6>
[  FAILED  ] adler32/adler32_variant.vmx/123, where GetParam() = 24-byte object <FC-70 57-07 00-00 00-00 E0-6D AB-36 01-00 00-00 14-00 00-00 1B-75 07-07>
[  FAILED  ] adler32/adler32_variant.vmx/125, where GetParam() = 24-byte object <75-29 13-C6 00-00 00-00 10-6E AB-36 01-00 00-00 14-00 00-00 B3-2D ED-2F>
[  FAILED  ] adler32/adler32_variant.vmx/127, where GetParam() = 24-byte object <AA-8C 3B-B6 00-00 00-00 40-6E AB-36 01-00 00-00 14-00 00-00 C0-90 05-DF>
[  FAILED  ] adler32/adler32_variant.vmx/128, where GetParam() = 24-byte object <B8-A2 45-8A 00-00 00-00 58-6E AB-36 01-00 00-00 1E-00 00-00 F8-AA 80-19>
[  FAILED  ] adler32/adler32_variant.vmx/129, where GetParam() = 24-byte object <78-5B E9-CB 00-00 00-00 78-6E AB-36 01-00 00-00 1E-00 00-00 C8-62 86-F5>
[  FAILED  ] adler32/adler32_variant.vmx/130, where GetParam() = 24-byte object <4B-A5 F8-4E 00-00 00-00 98-6E AB-36 01-00 00-00 1E-00 00-00 54-AC 65-1F>
[  FAILED  ] adler32/adler32_variant.vmx/131, where GetParam() = 24-byte object <7A-26 AD-76 00-00 00-00 B8-6E AB-36 01-00 00-00 1E-00 00-00 8E-2E 79-7B>
[  FAILED  ] adler32/adler32_variant.vmx/132, where GetParam() = 24-byte object <3C-61 9E-56 00-00 00-00 D8-6E AB-36 01-00 00-00 1E-00 00-00 9C-67 61-1D>
[  FAILED  ] adler32/adler32_variant.vmx/133, where GetParam() = 24-byte object <DA-61 AA-36 00-00 00-00 F8-6E AB-36 01-00 00-00 1E-00 00-00 7C-68 EC-12>
[  FAILED  ] adler32/adler32_variant.vmx/134, where GetParam() = 24-byte object <DF-22 72-F6 00-00 00-00 18-6F AB-36 01-00 00-00 1E-00 00-00 A9-29 03-74>
[  FAILED  ] adler32/adler32_variant.vmx/135, where GetParam() = 24-byte object <D3-4F B3-74 00-00 00-00 38-6F AB-36 01-00 00-00 1E-00 00-00 52-56 4C-37>
[  FAILED  ] adler32/adler32_variant.vmx/136, where GetParam() = 24-byte object <70-D7 1F-35 00-00 00-00 58-6F AB-36 01-00 00-00 1E-00 00-00 7E-DE DF-EA>
[  FAILED  ] adler32/adler32_variant.vmx/137, where GetParam() = 24-byte object <77-EF 5A-C4 00-00 00-00 78-6F AB-36 01-00 00-00 1E-00 00-00 64-F6 CB-3F>
[  FAILED  ] adler32/adler32_variant.vmx/138, where GetParam() = 24-byte object <71-EA 34-D0 00-00 00-00 98-6F AB-36 01-00 00-00 64-00 00-00 11-09 08-6B>
[  FAILED  ] adler32/adler32_variant.vmx/139, where GetParam() = 24-byte object <DE-C0 AD-DE 00-00 00-00 00-70 AB-36 01-00 00-00 64-00 00-00 73-DF 5F-35>
[  FAILED  ] adler32/adler32_variant.vmx/140, where GetParam() = 24-byte object <11-BA 5E-BA 00-00 00-00 68-70 AB-36 01-00 00-00 64-00 00-00 D8-D8 8B-B4>
[  FAILED  ] adler32/adler32_variant.vmx/141, where GetParam() = 24-byte object <45-AA 12-77 00-00 00-00 80-75 AB-36 01-00 00-00 B0-15 00-00 E2-1B C5-7D>
  • On glibc ppc64le, if I set -DWITH_ALTIVEC=OFF -DWITH_POWER8=ON -DWITH_POWER9=OFF, all pass.
  • On glibc ppc64le if I set -DWITH_ALTIVEC=OFF -DWITH_POWER8=OFF -DWITH_POWER9=ON, I get a bunch of failures again (like the first case) and a hang.
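
The combinations above can be enumerated mechanically. The following Python sketch only generates the command lines for every ON/OFF combination of the three options named in this report; it does not run them. The build directory naming and the 60-second timeout are assumptions for illustration, and it assumes a CMake recent enough to support `ctest --test-dir`.

```python
import itertools
import shlex

# Option names as used in the reports above.
FLAGS = ("WITH_ALTIVEC", "WITH_POWER8", "WITH_POWER9")


def build_matrix(source_dir=".", build_root="build-matrix"):
    """Return one list of commands per ON/OFF combination of FLAGS."""
    matrix = []
    for values in itertools.product(("OFF", "ON"), repeat=len(FLAGS)):
        # Hypothetical directory naming, e.g. build-matrix/OFF-ON-OFF.
        build_dir = f"{build_root}/{'-'.join(values)}"
        defines = [f"-D{name}={value}" for name, value in zip(FLAGS, values)]
        matrix.append([
            ["cmake", "-S", source_dir, "-B", build_dir, *defines],
            ["cmake", "--build", build_dir],
            # A per-test timeout keeps a hanging test from stalling the run.
            ["ctest", "--test-dir", build_dir, "--output-on-failure",
             "--timeout", "60"],
        ])
    return matrix


if __name__ == "__main__":
    for commands in build_matrix():
        for command in commands:
            print(shlex.join(command))
        print()
```

Printing the commands instead of executing them makes it easy to paste a single configuration into a CI job or a shell on the target machine.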

thesamesam avatar Jun 09 '23 13:06 thesamesam

-DWITH_ALTIVEC=OFF -DWITH_POWER8=ON -DWITH_POWER9=OFF

this fixes them

nekopsykose avatar Jun 09 '23 13:06 nekopsykose

-DWITH_ALTIVEC=OFF -DWITH_POWER8=OFF -DWITH_POWER9=ON

this breaks them again like originally

so, i guess it's not altivec, but either something power9 specific, or gcc is broken for power9

the gcc in this case is 13.1 from the 13-20230603 snapshot archive (+ usual distro patches but those don't affect ppc here)

nekopsykose avatar Jun 09 '23 13:06 nekopsykose

fails with 12.2 based gcc the same way

nekopsykose avatar Jun 09 '23 13:06 nekopsykose

I went back to gcc 9 before everything started to work...

mtl1979 avatar Jun 09 '23 13:06 mtl1979

it could be either some UB in any ppc-specific code that gets hit by the optimiser in newer gccs only, or gcc itself above some version broke for some ppc stuff, but in my experience "old gcc starts working" is usually the former. of course, nothing is certain yet :)

nekopsykose avatar Jun 09 '23 13:06 nekopsykose

I know gcc removed support for Power8, and we need that for zlib-ng...

mtl1979 avatar Jun 09 '23 13:06 mtl1979

Are you sure they did?

thesamesam avatar Jun 09 '23 13:06 thesamesam

I checked the build parameters for the compilers that I used for testing...

mtl1979 avatar Jun 09 '23 13:06 mtl1979

We've still got active POWER8 users in Gentoo w/ 13 and I still see it at https://gcc.gnu.org/onlinedocs/gcc/RS_002f6000-and-PowerPC-Options.html#index-mcpu-10.

thesamesam avatar Jun 09 '23 13:06 thesamesam

For a long time, forcing a Power8 target has been the same as targeting Power9... I tried forcing Power8 but it crashed, because I don't have Power9...

mtl1979 avatar Jun 09 '23 13:06 mtl1979

Yes, I think the VMX issue might be my fault, I didn't have a little endian system to test it with. The adler checksum actually doesn't care about the order of the adds but something is probably happening with the mixing matrix at the end. The VSX/POWER8 implementations on the other hand were already there. Might want to rope in @mscastanho.
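
To make the "order of the adds" point concrete, here is a small scalar Python sketch of Adler-32, checked against Python's own `zlib.adler32`. It is not the zlib-ng VMX code. `adler32_block` and its `weights` argument are hypothetical names that model the position-dependent multiply a vectorized version has to perform for the second sum.

```python
import zlib

MOD_ADLER = 65521  # largest prime below 2**16


def adler32_scalar(data: bytes, adler: int = 1) -> int:
    """Byte-at-a-time Adler-32."""
    s1 = adler & 0xFFFF
    s2 = (adler >> 16) & 0xFFFF
    for byte in data:
        s1 = (s1 + byte) % MOD_ADLER
        s2 = (s2 + s1) % MOD_ADLER
    return (s2 << 16) | s1


def adler32_block(block: bytes, weights) -> int:
    """Adler-32 of one block from the initial state, with explicit weights.

    s1 is a plain sum, so byte order does not matter for it. s2 weights
    byte i by (len(block) - i), so it depends on which weight meets which byte.
    """
    s1 = (1 + sum(block)) % MOD_ADLER
    s2 = (len(block) + sum(w * b for w, b in zip(weights, block))) % MOD_ADLER
    return (s2 << 16) | s1


data = bytes(range(1, 17))
correct = list(range(16, 0, -1))   # 16, 15, ..., 1
mirrored = correct[::-1]           # weights applied in the wrong lane order

assert adler32_scalar(data) == zlib.adler32(data)
assert adler32_block(data, correct) == zlib.adler32(data)

wrong = adler32_block(data, mirrored)
print("low half matches: ", (wrong & 0xFFFF) == (zlib.adler32(data) & 0xFFFF))
print("high half matches:", (wrong >> 16) == (zlib.adler32(data) >> 16))
```

With mirrored weights the low 16 bits (the plain sum) still agree with the reference while the high 16 bits do not, which is the kind of failure a lane-order mismatch would produce.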

KungFuJesus avatar Jun 09 '23 15:06 KungFuJesus

Yeah I'm almost certain this is what we're seeing: https://gcc.gcc.gnu.narkive.com/cJndcMpR/vec-ld-versus-vec-vsx-ld-on-power8

I'm fairly confident we can fix this by conditionally adjusting the mixing matrix based on endianness at compile time. It'd be helpful to have a little endian power system with VSX to test this on, though. The other option is to permute every load, and supposedly the compiler can eliminate these permutes by some keyhole optimization that rearranges the other operand for you, but I don't know that I'd want to rely on that.
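
The two options can be compared on a toy model. The Python sketch below assumes, purely for illustration, that the vector load delivers the 16 lanes in reversed order; whether that is what actually happens on ppc64le is exactly the open question here. Under that assumption, reversing the weights once and reversing the data on every load give the same weighted sum.

```python
def weighted_sum(lanes, weights):
    """Dot product of the byte lanes with their position weights."""
    return sum(w * b for w, b in zip(weights, lanes))


block = bytes(range(1, 17))
weights = list(range(16, 0, -1))

# Modeling assumption: the load hands back the lanes in reversed order.
loaded = block[::-1]

reference = weighted_sum(block, weights)            # what scalar code computes
broken = weighted_sum(loaded, weights)              # no adjustment at all
fix_permute = weighted_sum(loaded[::-1], weights)   # permute every load
fix_weights = weighted_sum(loaded, weights[::-1])   # adjust the weights once

print(reference, broken, fix_permute, fix_weights)
```

Adjusting the weights is a one-time cost outside the loop, whereas the permute sits on every load unless the compiler manages to remove it.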

Also, given the fact that we're only seeing the POWER7 unaligned load in the POWER8 version, something tells me this never worked for PPC64LE.

Though, the tests passing for POWER8 tells me it might just be a minor detail in the VMX implementation that makes it little endian unsafe rather than an absent byteswap. Either that or the vec_xl intrinsic is doing more than I initially thought.

KungFuJesus avatar Jun 09 '23 15:06 KungFuJesus

@mtl1979 found a fix for adler_vmx, that is a separate issue from this one. He's investigating the POWER9 issue now.

Looks like #1518 fixes both of these issues.

KungFuJesus avatar Jun 09 '23 18:06 KungFuJesus

applying https://github.com/zlib-ng/zlib-ng/commit/e5ab5890a89860a7f4115fca9b0fc7c778b26ef1 to 2.1.2 still seems to have the same failures

nekopsykose avatar Jun 11 '23 20:06 nekopsykose

The adler32 checksum should be fixed at least for Altivec. Which compiler are you using? The POWER9 fix tests the GCC version and applies a manual inline asm call if it's less than 13 to the vectorized count leading and trailing zeros intrinsic. Perhaps our fix is relying on an endianness convention that GCC recently fixed? That, or you're using clang and clang also got it wrong. The comment for the wrapper says we're working around this bug: https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=3f30f2d1dbb3228b8468b26239fe60c2974ce2ac

KungFuJesus avatar Jun 11 '23 20:06 KungFuJesus

psykose and I are both using latest 13s from the release branch, so I'd expect us to be fine wrt any bugs

thesamesam avatar Jun 11 '23 20:06 thesamesam

If you change that macro to always be true does that fix things? If so we may be missing an adjustment for endianness when using the intrinsic.

KungFuJesus avatar Jun 11 '23 21:06 KungFuJesus

Hm, everything passes for me w/ fresh 12 (not tried 13) on glibc ppc64le on develop..

thesamesam avatar Jun 11 '23 21:06 thesamesam

My guess (I don't have any power9 systems to verify this) is that gcc's fix for endianness inverted the need to switch between trailing and leading zeros. Perhaps gcc flips which instruction gets used based on endianness?
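
Why the choice between leading and trailing zeros depends on byte order can be shown with a scalar model. This Python sketch is not the POWER9 vector intrinsic. It finds the first differing byte of two 8-byte chunks by XORing them as integers, and the function names are made up for illustration.

```python
def first_mismatch_le(a: bytes, b: bytes) -> int:
    """Little-endian load: byte 0 is least significant, so count trailing zeros."""
    diff = int.from_bytes(a, "little") ^ int.from_bytes(b, "little")
    if diff == 0:
        return len(a)
    trailing_zeros = (diff & -diff).bit_length() - 1
    return trailing_zeros // 8


def first_mismatch_be(a: bytes, b: bytes) -> int:
    """Big-endian load: byte 0 is most significant, so count leading zeros."""
    diff = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    if diff == 0:
        return len(a)
    leading_zeros = 8 * len(a) - diff.bit_length()
    return leading_zeros // 8


def first_mismatch_le_wrong(a: bytes, b: bytes) -> int:
    """Little-endian load paired with a leading-zero count: the mixed-up case."""
    diff = int.from_bytes(a, "little") ^ int.from_bytes(b, "little")
    if diff == 0:
        return len(a)
    leading_zeros = 8 * len(a) - diff.bit_length()
    return leading_zeros // 8


a = b"ABCDEFGH"
b = b"ABCxEFGH"  # differs at index 3 only
print(first_mismatch_le(a, b), first_mismatch_be(a, b), first_mismatch_le_wrong(a, b))
```

Both correctly paired variants report index 3, while the mixed-up pairing reports index 4, i.e. the position counted from the other end of the word. If a compiler fix changed which end the intrinsic counts from, a workaround written for the old behaviour would end up in the mixed-up case.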

@mtl1979 maybe see what instruction gcc 12 is generating?

KungFuJesus avatar Jun 11 '23 21:06 KungFuJesus

@KungFuJesus I used inline assembly for gcc 13 too and it passed all tests... Seems like someone forgot to apply the patch to gcc 13 series...

mtl1979 avatar Jun 11 '23 22:06 mtl1979

Either that or that fix for gcc for endianness was actually swapping the emitted instruction based on endianness.

KungFuJesus avatar Jun 11 '23 22:06 KungFuJesus

I'm pretty sure the fix does swap the emitted instruction based on endianness.

However, I really want to keep the workaround simple and in the end we might need to use inline assembly always...

mtl1979 avatar Jun 11 '23 22:06 mtl1979