
[Bug]: When data type is unspecified, prevent compound dtypes

Open rly opened this issue 8 months ago • 0 comments

What happened?

Follow-up to https://github.com/hdmf-dev/hdmf-zarr/issues/273

As @oruebel wrote:

While TimeSeries does not restrict the data type, it assumes the use of a basic numeric data type (int, float) or string. Using a compound data type with a TimeSeries without defining an extension is currently not supported.

Our current docval-based validation methods do not validate the dtype of arrays, so we cannot validate on instantiation of the TimeSeries. We plan to replace docval this year, so I suggest we postpone until then the ability to prevent a TimeSeries from being instantiated with a structured array as its data.

As mentioned in the linked issue, the HDF5 backend raises an unclear error on write, and the Zarr backend does not raise an error at all.

An error should be raised before the backends are involved, probably during the convert_dtype step of the build process, where the user data is converted to the dtype specified in the spec. If no dtype is specified in the spec, then we should check that the data is of a basic numeric data type or string. This behavior should also be documented in the spec language.
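A minimal sketch of the proposed check, assuming it runs where the spec's dtype is known. The helper name `check_not_compound` and its `spec_dtype` parameter are hypothetical and not part of the HDMF API. The check relies only on the fact that a NumPy structured (compound) dtype has a non-None `names` attribute, whereas a plain numeric or string dtype has `names` set to None. Stand-in objects are used here so the example runs without NumPy installed.

```python
from types import SimpleNamespace


def check_not_compound(data, spec_dtype=None):
    """Hypothetical helper: reject compound data when the spec gives no dtype."""
    if spec_dtype is not None:
        # The spec specifies a dtype, so conversion to that dtype applies instead.
        return
    dtype = getattr(data, "dtype", None)
    # NumPy structured dtypes expose their field names via `dtype.names`;
    # for basic numeric and string dtypes, `names` is None.
    names = getattr(dtype, "names", None)
    if names is not None:
        raise ValueError(
            "Compound dtype with fields %s is not supported when the spec "
            "does not specify a dtype." % (tuple(names),)
        )


# Stand-ins that mimic only the `dtype.names` attribute of an array.
plain = SimpleNamespace(dtype=SimpleNamespace(names=None))
compound = SimpleNamespace(dtype=SimpleNamespace(names=("a", "b")))

check_not_compound(plain)  # passes silently
try:
    check_not_compound(compound)
except ValueError as err:
    print(err)
```

A real structured array such as `np.zeros(3, dtype=[("a", "i4"), ("b", "f8")])` would take the same path as `compound` above. This sketch only rejects compound dtypes; restricting the data further to basic numeric or string types, as the issue proposes, would need an additional check.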

Steps to Reproduce

See linked issue

Traceback


Operating System

Windows

Python Version

3.13

Package Versions

No response

rly, Apr 30 '25 23:04