zarr-python
test_format_compatibility fails on big-endian systems
Problem description
When running on a big-endian system, test_format_compatibility
fails due to an endian mismatch.
__________________________ test_format_compatibility ___________________________
    def test_format_compatibility():
        # This test is intended to catch any unintended changes that break the ability to
        # read data stored with a previous minor version (which should be format-compatible).
        # fixture data
        fixture = group(store=DirectoryStore('fixture'))
        # set seed to get consistent random data
        np.random.seed(42)
        arrays_chunks = [
            (np.arange(1111, dtype='<i1'), 100),
            (np.arange(1111, dtype='<i2'), 100),
            (np.arange(1111, dtype='<i4'), 100),
            (np.arange(1111, dtype='<i8'), 1000),
            (np.random.randint(0, 200, size=2222, dtype='u1').astype('<u1'), 100),
            (np.random.randint(0, 2000, size=2222, dtype='u2').astype('<u2'), 100),
            (np.random.randint(0, 2000, size=2222, dtype='u4').astype('<u4'), 100),
            (np.random.randint(0, 2000, size=2222, dtype='u8').astype('<u8'), 100),
            (np.linspace(0, 1, 3333, dtype='<f2'), 100),
            (np.linspace(0, 1, 3333, dtype='<f4'), 100),
            (np.linspace(0, 1, 3333, dtype='<f8'), 100),
            (np.random.normal(loc=0, scale=1, size=4444).astype('<f2'), 100),
            (np.random.normal(loc=0, scale=1, size=4444).astype('<f4'), 100),
            (np.random.normal(loc=0, scale=1, size=4444).astype('<f8'), 100),
            (np.random.choice([b'A', b'C', b'G', b'T'],
                              size=5555, replace=True).astype('S'), 100),
            (np.random.choice(['foo', 'bar', 'baz', 'quux'],
                              size=5555, replace=True).astype('<U'), 100),
            (np.random.choice([0, 1/3, 1/7, 1/9, np.nan],
                              size=5555, replace=True).astype('<f8'), 100),
            (np.random.randint(0, 2, size=5555, dtype=bool), 100),
            (np.arange(20000, dtype='<i4').reshape(2000, 10, order='C'), (100, 3)),
            (np.arange(20000, dtype='<i4').reshape(200, 100, order='F'), (100, 30)),
            (np.arange(20000, dtype='<i4').reshape(200, 10, 10, order='C'), (100, 3, 3)),
            (np.arange(20000, dtype='<i4').reshape(20, 100, 10, order='F'), (10, 30, 3)),
            (np.arange(20000, dtype='<i4').reshape(20, 10, 10, 10, order='C'), (10, 3, 3, 3)),
            (np.arange(20000, dtype='<i4').reshape(20, 10, 10, 10, order='F'), (10, 3, 3, 3)),
        ]
        compressors = [
            None,
            Zlib(level=1),
            BZ2(level=1),
            Blosc(cname='zstd', clevel=1, shuffle=0),
            Blosc(cname='zstd', clevel=1, shuffle=1),
            Blosc(cname='zstd', clevel=1, shuffle=2),
            Blosc(cname='lz4', clevel=1, shuffle=0),
        ]
        for i, (arr, chunks) in enumerate(arrays_chunks):
            if arr.flags.f_contiguous:
                order = 'F'
            else:
                order = 'C'
            for j, compressor in enumerate(compressors):
                path = '{}/{}'.format(i, j)
                if path not in fixture:  # pragma: no cover
                    # store the data - should be one-time operation
                    fixture.array(path, data=arr, chunks=chunks, order=order,
                                  compressor=compressor)
                # setup array
                z = fixture[path]
                # check contents
                if arr.dtype.kind == 'f':
                    assert_array_almost_equal(arr, z[:])
                else:
                    assert_array_equal(arr, z[:])
                # check dtype
>               assert arr.dtype == z.dtype
E               AssertionError: assert dtype('>U4') == dtype('<U4')
E                +  where dtype('>U4') = array(['foo', 'quux', 'quux', ..., 'bar', 'bar', 'quux'], dtype='>U4').dtype
E                +  and   dtype('<U4') = <zarr.core.Array '/15/0' (5555,) <U4>.dtype
zarr/tests/test_storage.py:2052: AssertionError
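The assertion compares two dtypes that hold the same values but differ in byte order. The sketch below reproduces that kind of mismatch on any host by building a big-endian string array explicitly. The helper `to_little_endian` is hypothetical and is not part of zarr or of the test suite. It only illustrates one way to normalize byte order, and is not a proposed fix for the test.

```python
import numpy as np


def to_little_endian(arr):
    """Hypothetical helper: return a copy of `arr` stored little-endian."""
    return arr.astype(arr.dtype.newbyteorder('<'))


# Build a big-endian string array explicitly, so the mismatch is visible
# regardless of the byte order of the machine running this snippet.
big = np.array(['foo', 'quux', 'bar'], dtype='>U4')
little = to_little_endian(big)

# Same values, different byte order: NumPy treats the dtypes as unequal.
print(big.dtype.str, little.dtype.str)
print(big.dtype == little.dtype)
print(big.tolist() == little.tolist())

# Spelling out the width ('<U4' rather than '<U') pins the byte order too.
explicit = np.array(['foo', 'quux', 'bar']).astype('<U4')
print(explicit.dtype.str)
```

In the traceback, the array created with `.astype('<U')` ended up as `>U4` on the big-endian host, which suggests the unsized `'<U'` request did not force little-endian storage there. That reading is an inference from the output above and has not been verified on big-endian hardware.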
Version and installation information
Please provide the following:
- Value of zarr.__version__: 2.10.1
- Value of numcodecs.__version__: 0.9.1
- Version of Python interpreter: 3.9.7
- Operating system: Fedora 34
- How Zarr was installed (e.g., "using pip into virtual environment", or "using conda"): from source
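Since the failure depends on the host's byte order, it may help to report that alongside the versions. The following is a minimal sketch that uses only the standard library. The function name `environment_report` and the default package list are assumptions made for illustration and are not part of the issue template.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("zarr", "numcodecs", "numpy")):
    """Collect interpreter, platform, byte order and package versions."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "machine": platform.machine(),
        "byteorder": sys.byteorder,  # 'little' or 'big'
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```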
Thanks, @QuLogic. I assume you haven't had any insights into what's going on?
Based on https://github.com/actions/virtual-environments/issues/2187, I assume GHA support for a big-endian system will arrive next year at the earliest. Shall we temporarily re-enable another platform such as CircleCI?
See #869 for an attempt to use a qemu docker image.
Unfortunately these tests passed in https://github.com/zarr-developers/zarr-python/runs/4214212844?check_suite_focus=true :/