HighFive
Overwriting a dataSet using createDataSet
I am using HighFive to write the output and restart files of my simulations. For steady-state problems, intermediate solutions are written at given intervals so that the simulation can be resumed if the run is interrupted. However, I want to keep only one intermediate solution at a time and discard the older ones. There may already be a solution to this that I have failed to find, but if there isn't, it would be great if the function createDataSet could take an additional boolean to overwrite a specific dataset if it already exists, without touching other datasets.
This is not implemented yet; we should design an API for it.
If you want to give it a try, feel free; we can help you once you open a PR.
I think that H5Easy can already do this. Note, though, that unless the dataset was allocated as extendable, the new data has to have exactly the same shape.
https://github.com/BlueBrain/HighFive/blob/417e4ff003dfe35f22c2352bce9d53a4fcac99ca/include/highfive/H5Easy.hpp#L76-L79
If you are considering a PR: it would make sense to consider the same API for createDataSet.
Note, though, that unless the dataset was allocated as extendable, the new data has to have exactly the same shape.
This is smart and should be kept for any implementation. Otherwise, I'd be concerned about growing files unintentionally.
I don't think there is even an "otherwise" ;). HDF5 cannot delete data, because it cannot rearrange the contents of a file the way your memory can. So:
- Overwriting the exact bits is fine.
- Datasets allocated as chunked can grow (but not shrink, or at least not really: shrinking will not release bits, though it might reuse them if you re-expand).
So if you want to overwrite an arbitrary dataset, you'd have to release the link between the name and the data (turning the old data into dead weight) and add new data. Note that if you do that, you have to repack (i.e. write a new HDF5 file) to get rid of the dead weight.
Should also match the datatype, no?
Yes!