qoi-rust icon indicating copy to clipboard operation
qoi-rust copied to clipboard

Two tests fail on s390x

Open kalev opened this issue 1 year ago • 9 comments

I am packaging qoi-rust for Fedora and as part of that, we try to run the test suite on all architectures that are available in the build system. On s390x, two tests fail:

failures:
---- test_encode_index_3ch stdout ----
thread 'test_encode_index_3ch' panicked at 'assertion failed: `(left == right)`
  left: `[254, 101, 102, 103, 254, 1, 2, 3, 254]`,
 right: `[254, 101, 102, 103, 254, 1, 2, 3, 51]`', tests/test_chunks.rs:26:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
---- test_encode_index_4ch stdout ----
thread 'test_encode_index_4ch' panicked at 'assertion failed: `(left == right)`
  left: `[255, 101, 102, 103, 104, 255, 1, 2, 3, 4, 56]`,
 right: `[255, 101, 102, 103, 104, 255, 1, 2, 3, 4, 54]`', tests/test_chunks.rs:26:5
failures:
    test_encode_index_3ch
    test_encode_index_4ch

Possibly some endianness issue as it is the only big endian architecture that we have?

kalev avatar Aug 23 '23 11:08 kalev

Hi, at first glance, I suspect the hash function used in the tests does not match the hash function in the crate itself. The spec encodes everything in LE from my understanding.

I'll need to inspect further. I don't have a big endian machine to test on unfortunately.

AregevDev avatar Oct 10 '23 07:10 AregevDev

Thanks! I don't have big endian machine to offer either, sorry - all I have is a way to kick off builds in the Fedora build system.

kalev avatar Oct 10 '23 08:10 kalev

I ran the two functions side by side on my Windows machine (LE) and they seem fine. I will download qemu and try to set up a BE environment to further test things.

I don't know if this repo is still maintained.

AregevDev avatar Oct 10 '23 10:10 AregevDev

I don't know if this repo is still maintained.

Maintained but not actively so. If there's clear issues though, we can investigate and fix.

aldanor avatar Oct 10 '23 12:10 aldanor

Ah, thank you for your response. I'll investigate the issue further and update here. LMK if you have any idea where things go wrong.

AregevDev avatar Oct 10 '23 13:10 AregevDev

Yes, my suspicion was correct. Pixel::hash_index is broken under BE. I was able to confirm it with miri targeting the s390x arch I don't really understand why it behaves like that as my bitwise skills are lacking. However, I came up with an iterator solution to calculate the hash:

fn hash_iter(px: [u8; 4]) -> u8 {
    px.into_iter()
        .zip([0x03u8, 0x05, 0x07, 0x0b].into_iter())
        .map(|p| p.0.wrapping_mul(p.1))
        .fold(0u8, |acc, x| acc.wrapping_add(x)) & 63
}

This code works as intended on both LE and BE.

AregevDev avatar Oct 10 '23 21:10 AregevDev

My code seems to be slower than the bitwise solution:

PS C:\Users\arege\Documents\Programming\hash_bench> cargo bench
    Finished bench [optimized] target(s) in 0.16s
     Running unittests src\lib.rs (target\release\deps\hash_bench-3a352fd57ea2ef9e.exe)
hash/hash_iter/2        time:   [1.6037 ns 1.6129 ns 1.6227 ns]
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) low mild
  4 (4.00%) high mild
  4 (4.00%) high severe
hash/hash_bitwise/3     time:   [1.3880 ns 1.3946 ns 1.4017 ns]
Found 14 outliers among 100 measurements (14.00%)
  3 (3.00%) low mild
  4 (4.00%) high mild
  7 (7.00%) high severe

AregevDev avatar Oct 10 '23 22:10 AregevDev

With some help, I was able to fix the bitwise code to be cross-platform. I ran all the tests on both LE and BE (miri) systems. I'll open a PR. @kalev please clone my fork and run it on your buildsystem, to see if it fixes the issue.

AregevDev avatar Oct 10 '23 23:10 AregevDev

I can confirm that this makes the two tests pass on s390x. Thanks a lot, @AregevDev!

kalev avatar Oct 11 '23 10:10 kalev