deck.gl-layers

parse in batches or using workers

Open · mapsgeek opened this issue 1 year ago • 1 comment

This is more of a help request than an issue. I would like an example of (or help with) reading/parsing files in batches, similar to @loaders.gl. For example, with loaders.gl we can do this:

import { parseInBatches } from '@loaders.gl/core';

// `loaders` is the array of loaders.gl loader objects for the file type
const batchIterator = await parseInBatches(file, loaders, {
  worker: true,
  batchSize: 4000,
  batchDebounceMs: 50,
  metadata: true,
});

// Collect rows as each batch arrives
const batches: unknown[] = [];
for await (const batch of batchIterator) {
  for (let i = 0; i < batch.data.length; i += 1) {
    batches.push(batch.data[i]);
  }
}

so the file gets loaded and parsed without blocking the main thread. I have been exploring plain JS web workers, so my solution would be something like the worker below, called on the file-upload event:

// parquet-worker.js
import { tableFromIPC } from 'apache-arrow';
// Import path assumed; adjust to however geoparquet-wasm is published
import { readGeoParquet } from 'geoparquet-wasm';

onmessage = function (event) {
  // event.data is the raw file buffer posted from the main thread
  const wasmTable = readGeoParquet(new Uint8Array(event.data));
  const jsTable = tableFromIPC(wasmTable.intoTable().intoIPCStream());
  // TODO: post something usable (e.g. the Arrow IPC bytes) back to the main thread
};
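
On the main thread, the upload handler would hand the file bytes to that worker roughly like this (just a sketch; the worker file name and module setup are placeholders for whatever the bundler needs):

// main thread (sketch)
const worker = new Worker(new URL('./parquet-worker.js', import.meta.url), {
  type: 'module',
});

async function onFileUpload(file: File) {
  const buffer = await file.arrayBuffer();
  // Transfer the buffer to the worker instead of copying it
  worker.postMessage(buffer, [buffer]);
}

worker.onmessage = (event) => {
  // event.data is whatever the worker posts back, e.g. Arrow IPC bytes
};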

I'm not sure yet if that's the right approach. I'm also confused about the earcutWorkerPoolSize and earcutWorkerUrl options on the layers and whether they might be a more effective way to solve this, so more information about them would be helpful.

Thanks

mapsgeek · Feb 24 '24 16:02

You can create one layer per Arrow batch. So if you have an incoming stream of Arrow batches, you can create an async iterable of layers, and deck should be able to handle that.
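
A rough sketch of that pattern, swapping the async iterable for an explicit deck.setProps() update as each batch arrives (it assumes GeoArrowScatterplotLayer from this package, point geometries in a column named "geometry", and a batch iterator like the one above; names and values are illustrative):

import { Deck } from '@deck.gl/core';
import * as arrow from 'apache-arrow';
import { GeoArrowScatterplotLayer } from '@geoarrow/deck.gl-layers';

async function renderBatches(
  deck: Deck,
  batchIterator: AsyncIterable<arrow.RecordBatch>
) {
  const layers: GeoArrowScatterplotLayer[] = [];
  let i = 0;
  for await (const batch of batchIterator) {
    // Wrap each record batch in its own Table and give it its own layer
    const table = new arrow.Table(batch);
    layers.push(
      new GeoArrowScatterplotLayer({
        id: `points-batch-${i++}`,
        data: table,
        getPosition: table.getChild('geometry'),
        getFillColor: [200, 0, 80],
      })
    );
    // Re-render with every layer created so far
    deck.setProps({ layers: [...layers] });
  }
}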

The earcut worker is separate and handles polygon triangulation.
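
For reference, those two options only tune that triangulation step on the polygon layers; roughly like the following (a sketch only — the layer name, the geometry column name, and whether earcutWorkerUrl takes a string or a URL should be checked against the docs):

import { GeoArrowSolidPolygonLayer } from '@geoarrow/deck.gl-layers';

const polygonLayer = new GeoArrowSolidPolygonLayer({
  id: 'polygons',
  data: table,
  getPolygon: table.getChild('geometry'),
  getFillColor: [0, 100, 200, 150],
  // Number of workers used for polygon triangulation
  earcutWorkerPoolSize: 4,
  // Location of the earcut worker script
  earcutWorkerUrl: '/earcut-worker.js',
});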

At some point geoparquet-wasm should be able to expose a stream of batches; this is already mostly implemented for the non-spatial case in parquet-wasm. I don't know when I'll have time to get to that, though. See also the discussion at https://github.com/geoarrow/geoarrow-rs/issues/283

kylebarron · Feb 24 '24 16:02