libSplash

Allow Append After Write: Expose max_dims

Open s0vereign opened this issue 8 years ago • 3 comments

As a test, I'm trying to append a value to a dataset written by SerialDataCollector, using the following:

...

    // Create necessary attributes
    splash::SerialDataCollector HDFile(1);
    splash::DataCollector::FileCreationAttr fa;
    splash::DataCollector::initFileCreationAttr(fa);
    fa.fileAccType = splash::DataCollector::FAT_CREATE;

    // This code lives inside a class, so filename is known here
    HDFile.open(filename.c_str(), fa);

    // Create and fill the data vector
    std::vector<double> p_sav(7);
 ...
    // Prepare the writing objects
    splash::ColTypeDouble ctdouble;
    splash::Dimensions local(7, 1, 1);
    splash::Selection sel(local);

    // Write operation
    HDFile.write(1, ctdouble, 1, sel, "param_data", p_sav.data());

    // Prepare data to append
    std::vector<double> p2(1);
    p2[0] = param->getne();

    // Append data
    HDFile.append(1, ctdouble, 1, "param_data", p2.data());
    HDFile.close();
...

Using this gives me the following error:

terminate called after throwing an instance of 'splash::DCException'
  what():  Exception for DCDataSet [param_data] append: Failed to extend dataset

The write works fine, but the append doesn't. Did I misinterpret the docs, or could this be a bug?

Thanks in advance ;)

s0vereign avatar May 10 '16 14:05 s0vereign

Thanks for the issue!

Looks a bit like a bug / not-implemented to me. As a workaround, you can replace your initial write call with

HDFile.append(1, ctdouble, 7, "param_data", p_sav.data());

to create the dataset.
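In context, a minimal sketch of the whole workaround, reusing the variables from your snippet above (only the relevant calls shown; the surrounding setup stays the same):

    // Workaround sketch: the first append creates "param_data" as an
    // extensible 1D dataset, further appends then grow it.
    HDFile.append(1, ctdouble, 7, "param_data", p_sav.data());
    HDFile.append(1, ctdouble, 1, "param_data", p2.data());
    HDFile.close();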

The reason is likely that regular write calls allocate a simple, fixed-size dataspace in HDF5 for performance reasons, whereas appending requires a dataspace with H5S_UNLIMITED maximum extents. To let you append to a dataset created by write, we would either need to expose the extensible option (max_dims) in the write API, or bring the reserve API from the ParallelDataCollector to serial datasets. I would prefer the latter for performance reasons, although it is less flexible.
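For the curious, here is a standalone sketch against the plain HDF5 C API (not the libSplash internals, which may set up their properties differently) of what an extensible dataset needs; the file and dataset names are placeholders:

    // A dataset can only be grown later if it was created with an unlimited
    // maximum extent and a chunked layout; a fixed-size dataspace makes
    // H5Dset_extent fail, which surfaces as "Failed to extend dataset".
    #include <hdf5.h>

    int main()
    {
        hid_t file = H5Fcreate("demo.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

        // Extensible dataspace: current size 7, maximum size unlimited.
        hsize_t dims[1]    = {7};
        hsize_t maxdims[1] = {H5S_UNLIMITED};
        hid_t space = H5Screate_simple(1, dims, maxdims);

        // Chunked layout is mandatory when any maximum dimension is unlimited.
        hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
        hsize_t chunk[1] = {128};
        H5Pset_chunk(dcpl, 1, chunk);

        hid_t dset = H5Dcreate2(file, "param_data", H5T_NATIVE_DOUBLE, space,
                                H5P_DEFAULT, dcpl, H5P_DEFAULT);

        // Growing the dataset by one element succeeds here; with maxdims == NULL
        // (i.e. maxdims equal to dims) the same call would fail.
        hsize_t newdims[1] = {8};
        H5Dset_extent(dset, newdims);

        H5Dclose(dset);
        H5Pclose(dcpl);
        H5Sclose(space);
        H5Fclose(file);
        return 0;
    }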

That said, be aware of the limitations of append: it is usually not a great operation performance-wise and works only on 1D datasets, as described in the manual. It is better to accumulate more data in RAM before writing, or to write a new dataset if that is possible in your case.
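A rough sketch of the accumulate-then-write pattern, with numIterations and computeValue as hypothetical placeholders:

    // Collect per-iteration values in RAM and write them with a single
    // call at the end instead of appending every iteration.
    std::vector<double> history;
    for (int it = 0; it < numIterations; ++it)
        history.push_back(computeValue(it));

    splash::Dimensions size(history.size(), 1, 1);
    HDFile.write(1, ctdouble, 1, splash::Selection(size), "param_history", history.data());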

CCing @f-schmitt-zih

ax3l avatar May 10 '16 18:05 ax3l

Okay, thank you very much! It's no problem at all, the workaround does the job as well. I only tried it because having one dataset for all iterations is a bit more convenient than one per iteration (at least for smaller data, where I/O performance is not critical).

s0vereign avatar May 11 '16 12:05 s0vereign

Ok, glad that helps!

I will leave this open for future improvements, but I guess that would require a change to the write API.

ax3l avatar May 11 '16 16:05 ax3l