
Unable to create a compound datatype with variable length byte arrays

Open kochhalm opened this issue 4 years ago • 5 comments

Hi.

Please excuse me for posting here; I couldn't find any support for BlueBrain/HighFive on StackOverflow.

I am trying to create a compound datatype with variable-length arrays. I created a file with h5py and confirmed that it works; h5dump shows the following format:

HDF5 "camera_data.h5py" {
GROUP "/" {
   GROUP "observations" {
      DATASET "0" {
         DATATYPE  H5T_COMPOUND {
            H5T_IEEE_F64LE "timestamp";
            H5T_ARRAY { [1] H5T_VLEN { H5T_STD_U8LE} } "bgr";
            H5T_ARRAY { [1] H5T_VLEN { H5T_STD_U8LE} } "d";
         }
         DATASPACE  SIMPLE { ( 39000 ) / ( 39000 ) }
         DATA {
         (0): {
               1.60865e+09,
               [ (255, 216, 255, 224, 0, 16, 74, 70, 73, 70, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 255, 219, 0, 67, 0, 2, 1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 2, 2, 2, 2, 4, 3, 2, 2, 2, 2, 5, 4, 4, 3, 4, 6, 5, 6, 6, 6, 5, 6, 6, 6, 7, 9, 8, 6, 7, 9, 7, 6, 6, 8, 11, 8, 9, 10, 10, 10, 10, 10, 6, 8, 11, 12, 11, 10, 12, 9, 10, 10, 10, 255, 219, 0, 67, 1, 2, 2, 2, 2, 2, 2, 5, 3, 3, 5, 10, 7, 6, 7, 
10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 255, 192, 0, 17, 8, 1, 224, 3, 80, 3, 1, 34, 0, 2, 17, 1, 3, 17, 1, 255, 196, 0, 31, 0, 0, 1, 5, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 255,
196, 0, 181, 16, 0, 2, 1, 3, 3, 2, 4, 3, 5, 5, 4, 4, 0, 0, 1, 125, 1, 2, 3, 0, 4, 17, 5, 18, 33, 49, 65, 6, 19, 81, 97, 7, 34, 113, 20, 50, 129, 145, 161, 8, 35, 66, 177, 193, 21, 82, 209, 240, 36, 51, 98, 114, 130, 9, 10, 22, 23, 24, 25, 26, 37, 38, 39, 40, 41, 42, 52, 53, 54, 55, 56, 57, 58, 67, 68, 69, 70, 71, 72, 73, 74, 83, 84, 85, 86, 87, 88, 89, 90, 99, 100, 101, 102, 103,
104, 105, 106, 115, 116, 117,

Can you please tell me how to create such a compound datatype using the HighFive C++ API? I couldn't find an example to repurpose, so I am asking here:

DATATYPE  H5T_COMPOUND {
    H5T_IEEE_F64LE "timestamp";
    H5T_ARRAY { [1] H5T_VLEN { H5T_STD_U8LE} } "bgr";
    H5T_ARRAY { [1] H5T_VLEN { H5T_STD_U8LE} } "d";
}

Thanks so much.

Regards, Manish

kochhalm avatar Jan 18 '21 03:01 kochhalm

Any updates?

kochhalm avatar Jan 21 '21 00:01 kochhalm

Are variable length strings/byte arrays supported in compound types?

mpb27 avatar Mar 30 '21 06:03 mpb27

@alkino @ferdonline : I think you looked at this recently? If so, could you answer here?

pramodk avatar Apr 09 '21 18:04 pramodk

I've looked at how this is done in hdf5-rust. They define a custom vector type whose layout matches the HDF5 vlen struct (definition linked below):

#[repr(C)]
pub struct VarLenArray<T: Copy> {
    len: usize,
    ptr: *const T,
    tag: PhantomData<T>,
}

https://github.com/aldanor/hdf5-rust/blob/7c737df88af4791e59124c0d956cbfa6ebdb8779/hdf5-types/src/array.rs#L8-L13

Then they call H5Pset_vlen_mem_manager on the transfer property list passed to H5Dread, so that HDF5 uses their custom allocators for the variable-length buffers.

https://github.com/aldanor/hdf5-rust/blob/63bb0b17baf10658914461a45b20775b2e32828d/src/hl/container.rs#L52-L57
https://github.com/aldanor/hdf5-rust/blob/63bb0b17baf10658914461a45b20775b2e32828d/src/hl/plist.rs#L216-L231

I believe that allows them to use free() in the destructor and return the vector without having to worry about a call back to the HDF5 library to reclaim the memory.
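
A rough C++ analogue of that idea, as a sketch only. The names `VarLenBytes`, `vlen_alloc` and `vlen_free` are hypothetical and are not part of HighFive or hdf5-rust. The struct mirrors the length-then-pointer layout of HDF5's `hvl_t`, and the two callbacks are assumed to match the shapes of `H5MM_allocate_t` and `H5MM_free_t` that `H5Pset_vlen_mem_manager` expects. No HDF5 call is made here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical owner of one variable-length byte buffer.
// Member order (length, then pointer) mirrors HDF5's hvl_t.
struct VarLenBytes {
    std::size_t len = 0;
    unsigned char* ptr = nullptr;

    VarLenBytes() = default;

    // Copy n bytes from src into a malloc-backed buffer.
    VarLenBytes(const unsigned char* src, std::size_t n) : len(n) {
        assert(src != nullptr || n == 0);
        if (n != 0) {
            ptr = static_cast<unsigned char*>(std::malloc(n));
            if (ptr == nullptr) throw std::bad_alloc();
            std::memcpy(ptr, src, n);
        }
    }

    // Owning type: movable, not copyable.
    VarLenBytes(const VarLenBytes&) = delete;
    VarLenBytes& operator=(const VarLenBytes&) = delete;

    VarLenBytes(VarLenBytes&& o) noexcept : len(o.len), ptr(o.ptr) {
        o.len = 0;
        o.ptr = nullptr;
    }
    VarLenBytes& operator=(VarLenBytes&& o) noexcept {
        if (this != &o) {
            std::free(ptr);
            len = o.len;
            ptr = o.ptr;
            o.len = 0;
            o.ptr = nullptr;
        }
        return *this;
    }

    // Because the buffer came from malloc, plain free() reclaims it.
    ~VarLenBytes() { std::free(ptr); }
};

static_assert(std::is_standard_layout<VarLenBytes>::value,
              "layout must stay compatible with a plain C struct");

// Allocator callbacks in the assumed H5MM_allocate_t / H5MM_free_t shape.
// Routing HDF5's vlen allocations through malloc/free is what would make
// the destructor above safe for buffers filled in by a read.
extern "C" void* vlen_alloc(std::size_t size, void* /*alloc_info*/) {
    return std::malloc(size);
}
extern "C" void vlen_free(void* mem, void* /*free_info*/) {
    std::free(mem);
}
```

With callbacks like these registered on the transfer property list, every buffer handed back by a read would be malloc-backed, so the owning type can release it without calling back into the library.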

mpb27 avatar Apr 09 '21 20:04 mpb27

Variable-length types are not supported, except for strings. H5T_ARRAY is not supported either; see #659.
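
One possible workaround is to drop down to the plain HDF5 C API. As a minimal sketch of the first step, this is the in-memory record that the compound type from the question would describe. `VlenBytes`, `Observation`, `MemberLayout` and `kObservationLayout` are hypothetical names. `VlenBytes` is a stand-in for HDF5's `hvl_t` so that the snippet compiles without HDF5. The offsets computed here are the kind of values one would pass to `H5Tinsert`, after building the member types with `H5Tvlen_create` and `H5Tarray_create2`. Those calls are not shown.

```cpp
#include <cstddef>
#include <type_traits>

// Stand-in for HDF5's hvl_t: a length followed by a data pointer.
struct VlenBytes {
    std::size_t len;
    void* p;
};

// Hypothetical in-memory record for one element of dataset "0".
struct Observation {
    double timestamp;  // H5T_IEEE_F64LE "timestamp"
    VlenBytes bgr[1];  // H5T_ARRAY { [1] H5T_VLEN { H5T_STD_U8LE } } "bgr"
    VlenBytes d[1];    // H5T_ARRAY { [1] H5T_VLEN { H5T_STD_U8LE } } "d"
};

static_assert(std::is_standard_layout<Observation>::value,
              "offsetof requires a standard-layout type");

// Name, byte offset and byte size of each compound member.
struct MemberLayout {
    const char* name;
    std::size_t offset;
    std::size_t size;
};

constexpr MemberLayout kObservationLayout[] = {
    {"timestamp", offsetof(Observation, timestamp), sizeof(double)},
    {"bgr", offsetof(Observation, bgr), sizeof(VlenBytes[1])},
    {"d", offsetof(Observation, d), sizeof(VlenBytes[1])},
};
```

The total compound size would be `sizeof(Observation)`, which includes any padding the compiler adds.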

alkino avatar Jan 23 '23 17:01 alkino