Allow client-generated exports of datasets

Open bmaranville opened this issue 2 years ago • 6 comments

Is your feature request related to a problem?

Providers (e.g. h5wasm) that read the HDF5 file directly in the browser cannot take advantage of server-side-generated, URL-based exports of datasets or slices.

Requested solution or feature

Allow client-generated exports to be included in the menu in addition to backend-supported URL endpoints, e.g. by accepting (lazy: Function | URL: string) instead of just URL: string.

Alternatives you've considered

Copying/pasting options discussed in #35 could be relevant, but exports would be nice.

Additional context

The NumPy target format (.NPY) is exceptionally well documented and would be very easy to implement client-side, as would CSV/TSV for datasets with fewer than 3 dimensions. TIFF is probably possible with a library for 2-dimensional datasets.
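To illustrate how little is needed for NPY, here is a rough sketch of a writer for the version 1.0 layout (magic string, version bytes, little-endian header length, padded ASCII header, raw data). The function name `toNpy` is hypothetical, and the sketch assumes the values are already available as a flat, row-major Float64Array; other dtypes would need a different `descr` string.

```typescript
// Hypothetical helper (not part of h5web or h5wasm): serialise a flat,
// row-major Float64Array and its shape as NPY version 1.0 bytes.
function toNpy(data: Float64Array, shape: number[]): Uint8Array {
  const expected = shape.reduce((acc, dim) => acc * dim, 1);
  if (expected !== data.length) {
    throw new Error(`shape [${shape.join(', ')}] does not match ${data.length} values`);
  }

  // The header is a Python dict literal; a 1-tuple needs a trailing comma.
  const shapeStr = shape.length === 1 ? `(${shape[0]},)` : `(${shape.join(', ')})`;
  let header = `{'descr': '<f8', 'fortran_order': False, 'shape': ${shapeStr}, }`;

  // Preamble: 6 magic bytes + 2 version bytes + 2 header-length bytes.
  const preambleLength = 10;
  // Pad with spaces so that preamble + header (ending in '\n') is a multiple of 64.
  const unpadded = preambleLength + header.length + 1;
  header += ' '.repeat((64 - (unpadded % 64)) % 64) + '\n';

  const bytes = new Uint8Array(preambleLength + header.length + data.byteLength);
  const view = new DataView(bytes.buffer);

  bytes.set([0x93, 0x4e, 0x55, 0x4d, 0x50, 0x59, 1, 0], 0); // "\x93NUMPY", v1.0
  view.setUint16(8, header.length, true); // little-endian header length
  for (let i = 0; i < header.length; i++) {
    bytes[preambleLength + i] = header.charCodeAt(i);
  }

  // Write values explicitly as little-endian to match the '<f8' descriptor.
  let offset = preambleLength + header.length;
  for (let i = 0; i < data.length; i++) {
    view.setFloat64(offset, data[i], true);
    offset += 8;
  }
  return bytes;
}
```

The resulting bytes could then be wrapped in a Blob and offered as a download; how that would be exposed in the export menu is exactly what this issue asks about.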

bmaranville avatar Apr 07 '22 18:04 bmaranville

@bmaranville, with the next release, you will be able to pass a getExportURL prop to H5WasmProvider to generate your own client-side exports. More info here, along with a client-side export example.

The H5WasmProvider could definitely benefit from having a built-in implementation of getExportURL, even partial (only 1/2D numeric datasets to CSV, for instance), so please do share with us what you come up with.
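As a starting point for such a partial implementation, the CSV conversion for 1D/2D numeric data might look roughly like the sketch below. `toCsv` is a hypothetical name, and the sketch assumes the values arrive as a flat, row-major array alongside the dataset shape; wiring it into getExportURL is not shown because the exact types are not covered here.

```typescript
// Hypothetical helper: turn flat, row-major numeric values into CSV text.
// 1D data becomes one value per line; 2D data becomes one row per line.
function toCsv(values: ArrayLike<number>, shape: number[]): string {
  if (shape.length !== 1 && shape.length !== 2) {
    throw new Error('only 1D and 2D datasets are supported');
  }

  const rows = shape[0];
  const cols = shape.length === 2 ? shape[1] : 1;
  if (rows * cols !== values.length) {
    throw new Error('shape does not match the number of values');
  }

  const lines: string[] = [];
  for (let r = 0; r < rows; r++) {
    const cells: string[] = [];
    for (let c = 0; c < cols; c++) {
      cells.push(String(values[r * cols + c]));
    }
    lines.push(cells.join(','));
  }
  return lines.join('\n');
}
```

In a browser, the returned string could be wrapped with `new Blob([csv], { type: 'text/csv' })` before being handed back to the viewer. Swapping the separator for a tab would give TSV.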

axelboc avatar Oct 26 '22 12:10 axelboc

Sounds like you've done all the hard stuff... I'll definitely share any progress I make.

bmaranville avatar Oct 26 '22 12:10 bmaranville

Something is wrong with the types being given for getExportURL, and the example pseudocode can't even work: value.forEach and value.map are both forbidden by the compiler, and clearly are needed for writing this function. Here is the error:

[TypeScript] Property 'forEach' does not exist on type 'never'.
/Users/bbm/dev/h5web/apps/demo/src/h5wasm/H5WasmApp.tsx:31:15
    29 |         assert('forEach' in value);
    30 |         input = [];
  > 31 |         value.forEach((val) => input.push(val));
       |               ^^^^^^^
    32 |         // input = (value as Value[]).map((v: unknown) => [v]);
    33 |       }
    34 |       else if (dataset.shape?.length === 0) {
[TypeScript] Parameter 'val' implicitly has an 'any' type.
/Users/bbm/dev/h5web/apps/demo/src/h5wasm/H5WasmApp.tsx:31:24
    29 |         assert('forEach' in value);
    30 |         input = [];
  > 31 |         value.forEach((val) => input.push(val));
       |                        ^^^
    32 |         // input = (value as Value[]).map((v: unknown) => [v]);
    33 |       }
    34 |       else if (dataset.shape?.length === 0) {

I am supplying a function signature as

function getExportURL<T extends Dataset>(format: unknown, dataset: T, selection: unknown, value: Value<T>) {

EDIT: I think I got something to work, by converting Value to an Array. The type definitions in this project are very detailed and I struggle sometimes to figure them out even when the underlying code (logic) is clear.
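For anyone hitting the same error, one way such a conversion might look is sketched below. This is not necessarily the exact workaround used here; `toPlainArray` is a hypothetical helper that takes the value as `unknown` and sidesteps the narrowed `never` type.

```typescript
// Hypothetical helper: normalise a dataset value (scalar, plain array or
// typed array) into a plain array so that forEach/map are always available.
function toPlainArray(value: unknown): unknown[] {
  if (Array.isArray(value)) {
    return value;
  }

  // Typed arrays are ArrayBuffer views; DataView is excluded because it is not iterable.
  if (ArrayBuffer.isView(value) && !(value instanceof DataView)) {
    return Array.from(value as unknown as ArrayLike<unknown>);
  }

  // Anything else is treated as a scalar and wrapped in a one-element array.
  return [value];
}
```

The elements are still typed as `unknown`, so each one needs its own check (e.g. `typeof val === 'number'`) before being used as a number.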

bmaranville avatar Oct 27 '22 16:10 bmaranville

Hey, sorry, I wasn't expecting you to try it out before the release. 😆 It did occur to me after closing the issue that some of the types/assertions would probably need to be exported to make implementing the function more straightforward. I'll investigate and keep in touch.

axelboc avatar Oct 28 '22 07:10 axelboc

Alright, so with #1249, value.forEach will work as expected and I've exported the type of the getExportURL to make it easier to extract the function from the JSX.

Now, there's still something that doesn't sit quite right with me: it's the fact that it's not possible to narrow down the type of the values contained inside the value array/typed array by narrowing down the dtype of the dataset - e.g. with H5Web's type guards (which are internal for now but could be exported for this purpose):

const getExportURL: GetExportURL = (format, dataset, selection, value) => async () => {
  if (hasNumericType(dataset)) {
    value.forEach(val => {
      // I would expect `val` to be `number` here, but it remains `unknown`
    })
  }
};

For now, the only options are to write a type guard specifically for the value array, or to check each value individually:

// With a `value` array type guard
function isNumericArray(arr: ArrayLike<unknown>): arr is number[] | TypedArray {
  return typeof arr[0] === 'number';
}

if (isNumericArray(value)) {
  value.forEach(val => { /* `val` is `number` */ });
}

// ... or if you know the provider always returns `TypedArray`
if (isTypedArray(value)) { ... }

// With an inline type guard to check each individual value
value.forEach(val => {
  if (typeof val === 'number') { ... }
});

No idea if this problem is solvable or not at this point.

axelboc avatar Oct 28 '22 14:10 axelboc

FYI, in #1485 I'm adding client-side JSON-export capability to the h5wasm data provider.

axelboc avatar Sep 01 '23 14:09 axelboc