H5D_read has problems with variable-length strings

Open klimpel opened this issue 2 years ago • 2 comments

The handling of variable-length strings is not yet implemented in hdf5_unified_read. For the case where the dataset has just a single element, I implemented it in my local version:

     } else if (ourType == GDL_STRING) {
 
-      if (debug) printf("fixed-length string dataset\n");
+      bool isVarLenStr = H5Tis_variable_str(elem_dtype) > 0;
+      if (debug) printf(isVarLenStr ? "variable-length string dataset\n" : "fixed-length string dataset\n");
 
       // string length (terminator included)
       SizeT str_len = H5Tget_size(elem_dtype);
 
       // total number of array elements
       SizeT num_elems=1;
       for(int i=0; i<rank_s; i++) num_elems *= count_s[i];
 
+      if (num_elems == 1 && isVarLenStr) {
+        char* raw = nullptr;
+        hdf5_basic_read( loc_id, datatype, ms_id, fs_id, &raw, e );
+
+        // create GDL variable
+        res = new DStringGDL(raw);
+
+        // release the string HDF5 allocated; arguments are (type, space, dxpl, buf)
+        H5Dvlen_reclaim (datatype, ms_id, H5P_DEFAULT, &raw);
+
+        return res;
+      }
       // allocate & read raw buffer
       char* raw = (char*) malloc(num_elems*str_len*sizeof(char));

Implementation of the array case is probably not difficult either, but I am neither familiar enough with DStringGDL, nor do I have a suitable .hdf5 file ready with which I could test that scenario.
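As a rough illustration of the array case: reading a variable-length string dataset with several elements fills a buffer of `char*` pointers, one per element, and each pointer then has to be copied into an owned string before the HDF5 memory is reclaimed. The sketch below shows only that copying step in plain C++, without HDF5 or GDL. The helper name `copy_vlen_strings` is hypothetical and not part of GDL, and treating a null pointer as an empty string is a defensive assumption rather than confirmed library behaviour.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of GDL or HDF5): copy the char* pointers
// filled in by a variable-length string read into owned std::string values.
// Assumption: a null pointer stands for an empty element, so it maps to "".
std::vector<std::string> copy_vlen_strings(char* const* raw, std::size_t num_elems) {
    std::vector<std::string> out;
    out.reserve(num_elems);
    for (std::size_t i = 0; i < num_elems; ++i) {
        // Each raw[i] is a NUL-terminated C string owned by the caller's buffer.
        out.emplace_back(raw[i] != nullptr ? raw[i] : "");
    }
    return out;
}
```

In hdf5_unified_read, the copied strings would presumably go into the elements of a DStringGDL array of matching dimensions, with the raw pointer buffer reclaimed afterwards as in the single-element patch above. I have not tested that part.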

klimpel, Feb 06 '24 20:02

I took the liberty of assigning this to @ogressel, who has substantially improved this code recently.

GillesDuvert, Feb 08 '24 16:02

Thanks, @klimpel. I will have a look if I find some time, but it's been long enough that I need to re-familiarize myself with the array-specific code.

ogressel, Feb 09 '24 09:02