[JS] Improve JS documentation on how to read/deserialize arrow data

Open bluehat974 opened this issue 2 years ago • 3 comments

Describe the enhancement requested

cc @domoritz

The current JS documentation does not clearly explain how to read and manipulate data with Apache Arrow JS.

The JS version of Apache Arrow is used across JS environments (DuckDB Wasm, ObservableHQ, Arquero), and people keep asking how to read the data properly, but there is no clear answer: https://github.com/duckdb/duckdb-wasm/pull/1418

There is some documentation on reading Arrow data or deserializing it to JSON: https://duckdb.org/docs/api/wasm/query.html#arrow-table-to-json https://observablehq.com/@theneuralbit/using-apache-arrow-js-with-large-datasets

but these examples should be consolidated into the original Apache Arrow JS documentation: https://github.com/apache/arrow/blob/main/js/README.md

Some ideas for code examples to add to the documentation:

  • The best way to read data without deserializing it into JSON
  • How to take advantage of JS Proxy to read data faster instead of deserializing to JSON
  • If serialization is required, how to do it properly
  • How to convert columns to rows
  • How to read nested types (STRUCT, MAP, DICTIONARY...)
  • How to cast between Arrow types (e.g. from DECIMAL to DOUBLE)
  • How to cast Arrow types (LONG, DOUBLE, DECIMAL) to the desired JS type (bigint, number, string...)
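
To make a few of these ideas concrete (Proxy-based reads, column-to-row conversion, and mapping 64-bit integers to JS types), here is a rough, library-free sketch in TypeScript. It deliberately does not call the apache-arrow package and is not a description of how Arrow JS is implemented; `Column`, `Columns`, `hasColumn`, `rowView`, `toJsValue` and `materialize` are hypothetical names made up for illustration, and the sample data is invented.

```typescript
// Library-free sketch: column-oriented storage with lazy row views.
// Column, Columns, hasColumn, rowView, toJsValue and materialize are
// hypothetical names, not part of the apache-arrow package.

type Column = Float64Array | BigInt64Array | string[];
type Columns = Record<string, Column>;

// True only for keys that name an actual column (ignores inherited keys).
const hasColumn = (columns: Columns, key: string | symbol): key is string =>
  typeof key === "string" && Object.prototype.hasOwnProperty.call(columns, key);

// A row view looks values up in the columns on access; nothing is copied.
function rowView(columns: Columns, index: number): Record<string, unknown> {
  return new Proxy({} as Record<string, unknown>, {
    get: (_t, key) => (hasColumn(columns, key) ? columns[key][index] : undefined),
    has: (_t, key) => hasColumn(columns, key),
    ownKeys: () => Object.keys(columns),
    getOwnPropertyDescriptor: (_t, key) =>
      hasColumn(columns, key)
        ? { value: columns[key][index], enumerable: true, configurable: true }
        : undefined,
  });
}

// 64-bit integers arrive as bigint. Return a number when that is lossless,
// otherwise keep the exact digits as a string.
function toJsValue(value: unknown): unknown {
  if (typeof value !== "bigint") return value;
  const n = Number(value);
  return Number.isSafeInteger(n) ? n : value.toString();
}

// Copy one row into a plain object that JSON.stringify accepts.
function materialize(row: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(row)) out[key] = toJsValue(row[key]);
  return out;
}

// Invented sample data: three columns, two rows.
const columns: Columns = {
  id: BigInt64Array.from([BigInt(1), BigInt("9007199254740993")]),
  price: Float64Array.from([9.5, 12.25]),
  name: ["apple", "pear"],
};

const second = rowView(columns, 1);
console.log(second.name); // pear
console.log(JSON.stringify(materialize(second)));
// {"id":"9007199254740993","price":12.25,"name":"pear"}
```

The point of the sketch is the trade-off: reading `second.name` touches a single cell, while `materialize` copies every field and is only needed when a plain object or JSON is actually required. Calling `JSON.stringify` directly on a row that still contains a `bigint` would throw, which is why the conversion step is separate.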

Component(s)

Documentation, JavaScript

bluehat974 · Sep 25 '23 13:09