HDF5: Specification lacks section on how data types are stored.

Open: hoene opened this issue 5 years ago • 1 comment

The HDF5 spec, in "IV.A.2.m. The Attribute Message", lists a field "Data (variable size)".

I have not found any section in the specification that describes how this data is actually encoded on disk.

I had to reverse-engineer the entire encoding. Please refer to https://github.com/hoene/libmysofa/blob/5c238dc820f16a21b4dce7e91df98431852146e7/src/hdf/dataobject.c#L661 through line 769.

Please clarify.

FYI @gheber

hoene commented Apr 13 '20 08:04

We should have a call on this, but here's the gist. Since HDF5 1.8, attributes can be stored either in compact or dense form. The difference is that for compactly stored attributes there is a size limit on the attribute value (the "Data" field), which is the same as that for datasets with compact layout (slightly under 64 KB). The values of densely stored attributes can be of arbitrary size and are stored in a fractal heap and NOT in the "Data" field of the attribute message. For a densely stored attribute, there would be an Attribute Info Message, see "IV.A.2.v. The Attribute Info Message."
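To make the compact/dense distinction concrete, here is a minimal sketch of how a reader might decide where an attribute's value lives. It is not taken from libmysofa or the HDF5 library. The struct `attr_info_sketch`, the function `classify_attr_storage`, and the constant `HDF5_UNDEFINED_ADDRESS` are hypothetical names, and the sketch assumes 8-byte file offsets, with an all-ones address meaning "undefined".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 8-byte offsets, where an all-ones address means "undefined". */
#define HDF5_UNDEFINED_ADDRESS UINT64_MAX

/* Hypothetical summary of what a parser found in one object header. */
struct attr_info_sketch {
    bool present;                  /* an Attribute Info Message was seen */
    uint64_t fractal_heap_address; /* heap address taken from that message */
};

enum attr_storage { ATTR_STORAGE_COMPACT, ATTR_STORAGE_DENSE };

/*
 * Dense storage: the values live in a fractal heap, so the heap address
 * must be defined. Otherwise the values sit in the "Data" field of the
 * Attribute Messages in the object header (compact storage).
 */
static enum attr_storage classify_attr_storage(const struct attr_info_sketch *info)
{
    assert(info != NULL);
    if (info->present && info->fractal_heap_address != HDF5_UNDEFINED_ADDRESS)
        return ATTR_STORAGE_DENSE;
    return ATTR_STORAGE_COMPACT;
}
```

A parser that only handles compact attributes could use such a check to fail early with a clear error instead of misreading a dense attribute.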

In any event, the binary representation (bytes on disk, encoding) is exactly the same as for dataset elements and values. That's why it's not repeated for attributes. Both HDF5 datasets and attributes are what one might call "HDF5 array variables." The difference between them is functional. Send me your availability and we'll schedule a call to talk this over!
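As a rough illustration of "same encoding for both", the sketch below decodes unsigned fixed-point elements from a packed byte buffer. The same routine could be pointed at an attribute's "Data" field or at a dataset's raw data. It assumes the simplest case only: elements stored back to back, precision equal to the full element size, and no bit offset or padding. The names `decode_unsigned`, `decode_element`, and `enum byte_order` are hypothetical and do not come from libmysofa or the HDF5 library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte order as declared by the element's datatype (hypothetical enum). */
enum byte_order { ORDER_LITTLE_ENDIAN, ORDER_BIG_ENDIAN };

/* Decode one unsigned fixed-point element of `size` bytes (1..8). */
static uint64_t decode_unsigned(const uint8_t *bytes, size_t size,
                                enum byte_order order)
{
    assert(bytes != NULL && size >= 1 && size <= 8);
    uint64_t value = 0;
    for (size_t i = 0; i < size; i++) {
        /* Walk from the most significant byte to the least significant. */
        size_t idx = (order == ORDER_LITTLE_ENDIAN) ? size - 1 - i : i;
        value = (value << 8) | bytes[idx];
    }
    return value;
}

/*
 * Decode element `index` from a packed buffer. The buffer may be an
 * attribute's "Data" field or a dataset's raw data; the layout of the
 * elements is assumed to be identical in both cases.
 */
static uint64_t decode_element(const uint8_t *data, size_t index, size_t size,
                               enum byte_order order)
{
    return decode_unsigned(data + index * size, size, order);
}
```

Floating-point, string, compound, and variable-length types need their own decoders, but the point carries over: the decoder is chosen by the datatype, not by whether the bytes came from an attribute or a dataset.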

gheber commented Apr 14 '20 00:04