
BoolArray fails to get

Open keller-mark opened this issue 2 years ago • 4 comments

https://observablehq.com/d/7152024fe1caf825

Using get() on a BoolArray does not seem to work as expected: the returned value is { data: {}, shape: [34406], stride: [1] }

keller-mark, Sep 27 '23 16:09

Ah, I see now: the seemingly empty object is iterable, so I can do

Array.from((await zarr.get(arr)).data);

So perhaps not a bug after all, but now I always have to check on my end whether .data is an iterable rather than a plain TypedArray:

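async function getData(arr) {
  const { data } = await zarr.get(arr);
  // TypedArrays are iterable too, so exclude them before converting.
  if (!ArrayBuffer.isView(data) && data?.[Symbol.iterator]) {
    return Array.from(data);
  }
  return data;
}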

keller-mark, Sep 27 '23 16:09

Maybe the user should opt-in to getting an iterator

zarr.getIterator(arr)

keller-mark, Sep 27 '23 16:09

Yeah, the idea was to keep the data in a TypedArray-like object for as long as possible (i.e., a strided view of the underlying bytes). But maybe this is more trouble than it's worth if there aren't use cases for keeping the underlying bytes.
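To illustrate why the result can look like an empty object, here is a minimal sketch of a TypedArray-like wrapper over raw bytes. `BoolView` is a hypothetical class written for this example, not zarrita's actual implementation, and it assumes one byte per boolean.

```javascript
// Hypothetical stand-in for a boolean array type; not zarrita's implementation.
// Assumes each boolean is stored as one byte (0 = false, nonzero = true).
class BoolView {
  #bytes;

  constructor(bytes) {
    this.#bytes = bytes; // Uint8Array, kept as-is (no copy)
  }

  get length() {
    return this.#bytes.length;
  }

  get(i) {
    return this.#bytes[i] !== 0;
  }

  // Makes the view iterable so Array.from() and for...of work.
  *[Symbol.iterator]() {
    for (let i = 0; i < this.length; i++) {
      yield this.get(i);
    }
  }
}

const view = new BoolView(new Uint8Array([1, 0, 1]));
const keys = Object.keys(view);  // [] : no own enumerable properties
const values = Array.from(view); // [true, false, true]
```

Because the bytes live in a private field, anything that only lists own enumerable properties shows `{}`, while iterating the view decodes the booleans on demand.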

This case happens with the string/bool array types, which is why I introduced the zarr.Array.is type guard. You could do something like:


let data;
if (arr.is("string") || arr.is("bool")) {
  // String and bool arrays come back as iterable views; copy into a plain Array.
  data = Array.from((await get(arr)).data);
} else {
  data = (await get(arr)).data;
}

but maybe that's still not very ergonomic. We could probably wrap this in a separate API as you suggested, which will coerce the typed arrays into object arrays.

manzt, Sep 27 '23 23:09

A thought I had yesterday. Maybe we could have a type-aware helper for coercing the data types:

let { data } = zarr.refineChunk(await get(array), {
   string: ({ data, shape, stride }) => ({ data: Array.from(data), shape, stride }),
   boolean: ({ data, shape, stride }) => ({ data: Array.from(data), shape, stride }),
})

Will need to think about it more, but curious to hear your thoughts.
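One possible shape for such a helper, sketched in plain JavaScript. Everything here is hypothetical: `refineChunk` is only a proposal in this thread, and the `kindOf` classifier is an assumption that guesses the kind from the data itself (first element for non-TypedArray iterables) rather than from the array's dtype.

```javascript
// Hypothetical helper; not part of zarrita. Handler keys mirror the proposal above.
function kindOf(data) {
  // Assumption: native TypedArrays are classified by their element type.
  if (data instanceof BigInt64Array || data instanceof BigUint64Array) return "bigint";
  if (ArrayBuffer.isView(data)) return "number";
  // Assumption: any other iterable is classified by its first element.
  for (const first of data) return typeof first; // "string", "boolean", ...
  return "unknown"; // empty chunk: nothing to inspect
}

function refineChunk(chunk, handlers) {
  const handler = handlers[kindOf(chunk.data)];
  // Chunks without a matching handler pass through unchanged.
  return handler ? handler(chunk) : chunk;
}

const toArray = ({ data, shape, stride }) => ({ data: Array.from(data), shape, stride });
const handlers = { string: toArray, boolean: toArray };

// A Set stands in for a non-TypedArray iterable such as a boolean array view.
const boolChunk = { data: new Set([true, false]), shape: [2], stride: [1] };
const numChunk = { data: new Float32Array([1, 2]), shape: [2], stride: [1] };

const refinedBool = refineChunk(boolChunk, handlers); // data becomes a plain Array
const refinedNum = refineChunk(numChunk, handlers);   // returned unchanged
```

Guessing from the first element breaks down for empty chunks, so a real version would probably want the dtype passed in or attached to the chunk.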

manzt, Nov 03 '23 14:11