zarr-python

Arrays with padded structured datatypes fail

Open jrs65 opened this issue 2 years ago • 0 comments

Zarr version

v2.16.2

Numcodecs version

v0.12.1

Python Version

3.11

Operating System

Linux

Installation

Using pip into virtualenv

Description

Creating an array with a structured datatype that contains padding causes writes into that array to fail with a TypeError for an unsafe cast. This appears to happen because zarr adds an extra field to the datatype to account for the padding: the numpy array being assigned has two fields, but the zarr array being assigned to has three.

I believe the cause is in zarr.meta.Meta2.encode_dtype, which, for a structured datatype, uses numpy's description of the type (np.dtype.descr) as the datatype definition. That description contains an anonymous field for the padding, but round-tripping it back to a concrete dtype gives that field a name (in the example below, by doing np.dtype(a.dtype.descr)).
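The round trip can be seen with numpy alone, without zarr. This is a minimal sketch using the same padded dtype as in the steps to reproduce below; it only illustrates the descr behaviour and does not call any zarr code.

```python
import numpy as np

# Padded structured dtype: 5 bytes of data, itemsize forced to 8.
dt = np.dtype({'names': ['a', 'b'], 'formats': ['<u4', 'u1'],
               'offsets': [0, 4], 'itemsize': 8})

# descr reports the trailing padding as an unnamed void entry.
descr = dt.descr
print(descr)

# Rebuilding a dtype from descr turns the padding into a real field
# with an automatically generated name.
round_tripped = np.dtype(descr)
print(dt.names)
print(round_tripped.names)

# Same itemsize, but no longer the same dtype.
print(dt.itemsize, round_tripped.itemsize)
print(dt == round_tripped)
```

The two dtypes have the same item size but a different number of fields, which would explain why the assignment is treated as an unsafe cast.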

I think zarr should either support this case or, if not, warn or throw a specific exception at array creation time.
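As a rough illustration of the second option, a check along these lines could flag the problem before any data is written. The helper name has_unnamed_padding is hypothetical and not part of zarr; it only inspects top-level fields and does not recurse into nested structured fields.

```python
import numpy as np

def has_unnamed_padding(dtype):
    """Hypothetical helper: True if descr exposes padding as an unnamed field."""
    dtype = np.dtype(dtype)
    if dtype.names is None:
        # Not a structured dtype, so there is nothing to check.
        return False
    # Each descr entry starts with the field name; padding has the name ''.
    return any(entry[0] == '' for entry in dtype.descr)

padded = np.dtype({'names': ['a', 'b'], 'formats': ['<u4', 'u1'],
                   'offsets': [0, 4], 'itemsize': 8})
packed = np.dtype([('a', '<u4'), ('b', 'u1')])

for candidate in (padded, packed):
    if has_unnamed_padding(candidate):
        print(f"{candidate} contains padding and would not round-trip via descr")
    else:
        print(f"{candidate} is safe to round-trip via descr")
```

A check like this could back either a warning or a dedicated exception; which one is appropriate is a design decision for the maintainers.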

Steps to reproduce

import numpy as np
import zarr

# Create a numpy datatype with two fields with extra padding to align to 8 bytes total
# Although this is a bit contrived it's actually what I have from a bunch of HDF5 files
dt = np.dtype({'names': ['a', 'b'], 'formats': ['<u4', 'u1'], 'offsets': [0, 4], 'itemsize': 8})

# Create a numpy array with this datatype (this works fine)
a = np.zeros((10,), dtype=dt)

# Create a zarr array with the exact same datatype
z = zarr.zeros(shape=(10,), chunks=(10,), dtype=dt)

# This datatype is as expected
print(f"{a.dtype=}")

# This has gained an extra field for the padding: dtype([('a', '<u4'), ('b', 'u1'), ('f2', 'V3')])
print(f"{z.dtype=}")

# Note that this is basically the output of
print(f"{a.dtype.descr=}")

# This fails with a TypeError for an unsafe cast
z[:] = a

Additional output

No response

jrs65, Jan 10 '24 11:01