Arrays with padded structured datatypes fail
Zarr version
v2.16.2
Numcodecs version
v0.12.1
Python Version
3.11
Operating System
Linux
Installation
Using pip into virtualenv
Description
Creating an array with a structured datatype that contains padding causes writes into that array to raise a TypeError for an unsafe cast. This seems to be because zarr adds an extra field to the datatype to account for the padding: the numpy array being assigned has two fields, while the zarr array being assigned to has three.
I believe the cause is in zarr.meta.Meta2.encode_dtype, which for a structured datatype uses numpy's description of the type (np.dtype.descr) as the datatype definition. That description contains an anonymous field for the padding, but round-tripping it back to a concrete dtype gives that field a name (in the example below, np.dtype(a.dtype.descr)).
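The round trip described above can be shown with numpy alone. This is a minimal sketch that assumes the same dtype as in the reproduction steps; it does not call zarr, so it only illustrates the suspected mechanism rather than confirming what zarr does internally.

```python
import numpy as np

# Same padded dtype as in the reproduction: 5 bytes of data in an 8-byte item
dt = np.dtype({'names': ['a', 'b'], 'formats': ['<u4', 'u1'],
               'offsets': [0, 4], 'itemsize': 8})

# descr lists the 3 padding bytes as a void field with an empty name
print(dt.descr)

# Rebuilding a dtype from descr gives that anonymous field a default name
roundtrip = np.dtype(dt.descr)
print(dt.names)         # ('a', 'b')
print(roundtrip.names)  # ('a', 'b', 'f2')
print(roundtrip == dt)  # False
```

Both dtypes have the same itemsize, but they no longer have the same set of fields, which is consistent with the unsafe-cast error on assignment.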
I think zarr should either support this case or, if it does not, warn or raise a specific exception at array creation time.
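As a rough illustration of the second option, a check at creation time could look something like the sketch below. The function name check_no_padding is hypothetical and is not part of zarr. It only inspects top-level fields, and it does not handle dtypes with overlapping or out-of-order fields, for which np.dtype.descr is not defined.

```python
import numpy as np

def check_no_padding(dtype):
    """Hypothetical helper (not a zarr API): reject padded structured dtypes."""
    dtype = np.dtype(dtype)
    if dtype.names is None:
        # Plain (non-structured) dtypes have no fields to check
        return dtype
    # np.dtype.descr reports padding bytes as void fields with an empty name
    if any(entry[0] == '' for entry in dtype.descr):
        raise TypeError(
            f"structured dtype {dtype!r} contains padding, "
            "which is not supported"
        )
    return dtype

padded = np.dtype({'names': ['a', 'b'], 'formats': ['<u4', 'u1'],
                   'offsets': [0, 4], 'itemsize': 8})
packed = np.dtype([('a', '<u4'), ('b', 'u1')])

check_no_padding(packed)  # passes silently
try:
    check_no_padding(padded)
except TypeError as exc:
    print(f"rejected: {exc}")
```

Failing early like this would at least point at the dtype itself instead of surfacing later as an unsafe-cast error on the first write.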
Steps to reproduce
import numpy as np
import zarr
# Create a numpy datatype with two fields with extra padding to align to 8 bytes total
# Although this is a bit contrived it's actually what I have from a bunch of HDF5 files
dt = np.dtype({'names': ['a', 'b'], 'formats': ['<u4', 'u1'], 'offsets': [0, 4], 'itemsize': 8})
# Create a numpy array with this datatype (this works fine)
a = np.zeros((10,), dtype=dt)
# Create a zarr array with the exact same datatype
z = zarr.zeros(shape=(10,), chunks=(10,), dtype=dt)
# This datatype is as expected
print(f"{a.dtype=}")
# This has gained an extra field for the padding: dtype([('a', '<u4'), ('b', 'u1'), ('f2', 'V3')])
print(f"{z.dtype=}")
# Note that this is basically the output of
print(f"{a.dtype.descr=}")
# This fails with a TypeError for an unsafe cast
z[:] = a
Additional output
No response