h5wasm

CRUD operations

Open carlogambi opened this issue 2 years ago • 2 comments

Hi all, I’m trying to do CRUD operations in order to manage and populate a dataset. Writing and reading are straightforward, but I cannot work out how to update or delete attributes, groups, or datasets. I’m working with a stream of data in a NestJS server, where the elements are:

{ _id, timestamp, value }

I need to create a dataset for every _id that contains timestamp and value (the 1st and 2nd columns of an n x 2 table). I’m working with a large volume of data and cannot collect everything server-side before creating the dataset, so I need to create the dataset and then update it at every step. Finally, I don't know whether a dataset can be created with a dynamic length, but I can retrieve the final length of the dataset. Kind regards, Carlo
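
To make the data layout concrete, here is a minimal sketch of the buffering side only, with no HDF5 calls involved. `RowBatcher`, `batchSize` and `onFlush` are hypothetical names invented for this illustration; the idea is just that records are grouped per _id into flat, row-major n x 2 batches so that only a bounded number of rows stays in memory at any time.

```javascript
// Hypothetical helper (not part of h5wasm): groups streamed
// { _id, timestamp, value } records into per-_id batches of rows.
class RowBatcher {
  constructor(batchSize, onFlush) {
    this.batchSize = batchSize; // rows to hold per _id before flushing
    this.onFlush = onFlush;     // called as onFlush(id, Float64Array), row-major n x 2
    this.pending = new Map();   // _id -> flat array [t0, v0, t1, v1, ...]
  }

  push({ _id, timestamp, value }) {
    let flat = this.pending.get(_id);
    if (!flat) {
      flat = [];
      this.pending.set(_id, flat);
    }
    flat.push(timestamp, value); // one row = two consecutive numbers
    if (flat.length >= 2 * this.batchSize) this.flush(_id);
  }

  flush(id) {
    const flat = this.pending.get(id);
    if (!flat || flat.length === 0) return;
    this.pending.delete(id);
    this.onFlush(id, Float64Array.from(flat));
  }

  // Flush whatever is left, e.g. when the stream ends.
  flushAll() {
    for (const id of [...this.pending.keys()]) this.flush(id);
  }
}

// Usage: collect the flushed batches in an array instead of writing them out.
const flushed = [];
const batcher = new RowBatcher(2, (id, rows) => flushed.push({ id, rows }));
batcher.push({ _id: "a", timestamp: 1, value: 10 });
batcher.push({ _id: "b", timestamp: 1, value: 20 });
batcher.push({ _id: "a", timestamp: 2, value: 11 }); // "a" reaches 2 rows and is flushed
batcher.flushAll();                                  // flushes the single row left for "b"
```

Each flushed batch is what would need to be appended to the dataset for that _id, which is the part I cannot find a way to do.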

carlogambi, Mar 28 '22 07:03

Hello, Carlo - Currently there is no resizing or appending possible with h5wasm. The only available option right now is to create a (whole, finished) HDF5 dataset from an existing array of values.

It's certainly possible to implement using parts of the HDF5 API, which provides a way of making resizable datasets - but it is not currently implemented. Implementation would involve:

at Dataset creation:

  • setting chunksize at dataset creation
  • setting the maxsize for each dimension (could include axes with unlimited max size)

when writing a point:

  • calling H5Dextend to resize the dataset
  • writing to a slice for the last point

None of these features is currently implemented in h5wasm: slicing is enabled for reading but not for writing, and the create_dataset function has no options for passing chunk sizes or maximum sizes.
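
The bookkeeping behind the "when writing a point" steps can be sketched independently of any library. `planAppend` below is a hypothetical helper, not an h5wasm function; it assumes appends happen along the first axis, that `null` in the maximum shape means "unlimited", and that slice ranges are half-open `[start, stop)` pairs per axis.

```javascript
// Hypothetical helper, not part of h5wasm: works out the new shape and the
// slice covering only the new rows for an append along axis 0.
// Assumption for this sketch: a null entry in maxShape means an unlimited axis.
function planAppend(shape, maxShape, rowsToAdd) {
  if (!Number.isInteger(rowsToAdd) || rowsToAdd <= 0) {
    throw new RangeError("rowsToAdd must be a positive integer");
  }
  const newRows = shape[0] + rowsToAdd;
  if (maxShape[0] !== null && newRows > maxShape[0]) {
    throw new RangeError(`append would exceed max size ${maxShape[0]} on axis 0`);
  }
  const newShape = [newRows, ...shape.slice(1)];
  // Half-open [start, stop) range per axis; axis 0 covers only the new rows,
  // every other axis is covered in full.
  const ranges = newShape.map((size, axis) =>
    axis === 0 ? [shape[0], newRows] : [0, size]
  );
  return { newShape, ranges };
}

// A 100 x 2 dataset with an unlimited first axis, growing by 3 rows.
const plan = planAppend([100, 2], [null, 2], 3);
// plan.newShape -> [103, 2]
// plan.ranges   -> [[100, 103], [0, 2]]
```

The dataset would first be extended to `newShape`, and the new points would then be written into `ranges`.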

Out of curiosity, are you hoping to do this in the browser (where you would be accumulating points in memory no matter how you do it) or in node.js, where you could write directly to disk?

bmaranville, Mar 28 '22 19:03

As of v0.4.11, released just now, you can create resizable datasets (you must specify chunks and maxshape for create_dataset), resize them, and overwrite sections of data (so after a resize, you can write to the newly extended region).
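
The resize-then-write pattern can be sketched as follows. To keep the example runnable without the library it uses an in-memory stand-in, `FakeResizableDataset`, which is purely hypothetical; the member names `shape`, `resize` and `write_slice`, and the half-open `[start, stop)` range format, are assumptions and should be checked against the h5wasm documentation for the real signatures.

```javascript
// In-memory stand-in for a resizable 2-D dataset, used only so this sketch
// runs on its own. The member names (shape, resize, write_slice) are
// assumptions, not a confirmed h5wasm API.
class FakeResizableDataset {
  constructor(columns) {
    this.shape = [0, columns];
    this.values = new Float64Array(0); // row-major storage
  }

  // Only axis 0 may change in this stand-in; the column count stays fixed.
  resize(newShape) {
    const grown = new Float64Array(newShape[0] * newShape[1]);
    grown.set(this.values.subarray(0, grown.length));
    this.values = grown;
    this.shape = [...newShape];
  }

  // ranges: one half-open [start, stop) pair per axis; data: flat, row-major.
  write_slice(ranges, data) {
    const [[rowStart, rowStop], [colStart, colStop]] = ranges;
    const width = colStop - colStart;
    for (let r = rowStart; r < rowStop; r++) {
      for (let c = colStart; c < colStop; c++) {
        this.values[r * this.shape[1] + c] =
          data[(r - rowStart) * width + (c - colStart)];
      }
    }
  }
}

// Appends [timestamp, value] rows: grow axis 0 first, then write only the
// region that the resize added.
function appendRows(dataset, rows) {
  if (rows.length === 0) return;
  const [oldRows, columns] = dataset.shape;
  const newRows = oldRows + rows.length;
  dataset.resize([newRows, columns]);
  dataset.write_slice(
    [[oldRows, newRows], [0, columns]],
    Float64Array.from(rows.flat())
  );
}

const dataset = new FakeResizableDataset(2);
appendRows(dataset, [[1000, 0.5], [1001, 0.7]]);
appendRows(dataset, [[1002, 0.9]]);
// dataset.shape -> [3, 2]
```

With a real file, the stand-in would be replaced by a dataset created with chunks and maxshape; how an unlimited axis is expressed in maxshape is not covered here.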

bmaranville, Apr 19 '23 14:04