Feature request: Support Arrow data
It would be nice if Codon supported Arrow data in the future, in addition to NumPy arrays.
nanoarrow, from the Apache Arrow project, should be relatively easy to embed. From its documentation:
The nanoarrow libraries are a set of helpers to produce and consume Arrow data, including the Arrow C Data, Arrow C Stream, and Arrow C Device structures and the serialized Arrow IPC format. The vision of nanoarrow is that it should be trivial for libraries to produce and consume Arrow data: it helps fulfill this vision by providing high-quality, easy-to-adopt helpers to produce, consume, and test Arrow data types and arrays.
The nanoarrow libraries were built to be:
Small: nanoarrow’s C runtime compiles into a few hundred kilobytes and its R and Python bindings both have an installed size of ~1 MB.
Easy to depend on: nanoarrow’s C library is distributed as two files (nanoarrow.c and nanoarrow.h) and its R and Python bindings have zero dependencies.
Useful: The Arrow Columnar Format includes a wide range of data type and data encoding options. To the greatest extent practicable, nanoarrow strives to support the entire Arrow columnar specification (see the Arrow implementation status page for implementation status).
https://arrow.apache.org/nanoarrow/latest/index.html
Thanks for the suggestion, @ghuls -- we're planning to support this via our Codon-native Pandas which should be coming soon. Should also be straightforward to support Arrow->NumPy directly as well.
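For anyone following along, here is a rough sketch of why an Arrow->NumPy path looks feasible for primitive columns. It mirrors the `ArrowArray` struct from the Arrow C Data Interface using plain Python `ctypes` and reads an int64 column out of its data buffer. This is not Codon's or nanoarrow's API: `make_int64_array` and `read_int64_values` are hypothetical helper names, the example assumes a single int64 column with no nulls, and it leaves the `release` callback unset, which a real producer must not do.

```python
import ctypes


class ArrowArray(ctypes.Structure):
    """Mirror of `struct ArrowArray` from the Arrow C Data Interface."""


# Fields are assigned after the class exists because the struct refers to itself.
ArrowArray._fields_ = [
    ("length", ctypes.c_int64),
    ("null_count", ctypes.c_int64),
    ("offset", ctypes.c_int64),
    ("n_buffers", ctypes.c_int64),
    ("n_children", ctypes.c_int64),
    ("buffers", ctypes.POINTER(ctypes.c_void_p)),
    ("children", ctypes.POINTER(ctypes.POINTER(ArrowArray))),
    ("dictionary", ctypes.POINTER(ArrowArray)),
    ("release", ctypes.c_void_p),  # function pointer, kept opaque in this sketch
    ("private_data", ctypes.c_void_p),
]


def make_int64_array(values):
    """Hypothetical helper: describe a ctypes int64 buffer as an ArrowArray."""
    data = (ctypes.c_int64 * len(values))(*values)
    # Primitive layout: buffers[0] is the validity bitmap (NULL here, since
    # there are no nulls) and buffers[1] is the contiguous data buffer.
    buffers = (ctypes.c_void_p * 2)(None, ctypes.addressof(data))

    arr = ArrowArray()
    arr.length = len(values)
    arr.null_count = 0
    arr.offset = 0
    arr.n_buffers = 2
    arr.n_children = 0
    arr.buffers = ctypes.cast(buffers, ctypes.POINTER(ctypes.c_void_p))
    # `release` stays NULL for brevity; a real exported array must set it.
    # Return the backing objects too, so the caller keeps the memory alive.
    return arr, (data, buffers)


def read_int64_values(arr):
    """Hypothetical helper: read the int64 data buffer of a null-free array."""
    assert arr.n_buffers == 2 and arr.null_count == 0
    data_ptr = ctypes.cast(
        ctypes.c_void_p(arr.buffers[1]), ctypes.POINTER(ctypes.c_int64)
    )
    # Honor the logical offset, as the C Data Interface requires of consumers.
    return [data_ptr[arr.offset + i] for i in range(arr.length)]


arr, _keepalive = make_int64_array([10, 20, 30])
print(read_int64_values(arr))  # [10, 20, 30]
```

The list comprehension copies values into Python integers for clarity. Since the data buffer is just contiguous int64 memory, a NumPy-style array could presumably wrap the same pointer without copying, which is the part that makes the direct route look straightforward. Nullable columns, nested types, and dictionary encoding would need more work.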
Merging with #608