tremor-runtime
Add parquet format to codecs
Describe the problem you are trying to solve
Tremor cannot encode/decode parquet as a codec.
Describe the solution you'd like
It would be nice to have Parquet as a supported codec format.
Notes
The official Rust implementation of Parquet can be found in the Apache Arrow project: https://github.com/apache/arrow-rs
Oh yeah, let's make that happen!
You might want to look at https://github.com/jorgecarleitao/parquet2 as well. It is a more idiomatic rewrite of parquet.
If you also want to process data via IPC (e.g., network, UNIX pipes, shared mmap), then Arrow IPC would offer higher interop.
Arrow itself can read and write Parquet, which is typically used only as an on-disk file format.
That's definitely worth looking at too! We generally try to separate the encoding (Arrow/Parquet) from the transport (UNIX socket, network, mmap, etc.) so that the parts become interchangeable (i.e. we have a UNIX socket, a TCP, and a UDP connector, so by adding Arrow encoding we'd unlock all those transports at once :D )
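To illustrate the encoding/transport split described above, here is a minimal sketch in plain Rust. It is not Tremor's actual codec or connector API: the `Codec` and `Transport` traits, `Utf8Codec`, `InMemoryTransport`, and `round_trip` are all hypothetical names made up for this example, and the UTF-8 codec is just a stand-in for where an Arrow or Parquet codec would plug in.

```rust
use std::collections::VecDeque;

/// Hypothetical codec interface (an assumption, not Tremor's real trait):
/// turns a value into bytes and back, knowing nothing about transports.
trait Codec {
    fn encode(&self, value: &str) -> Vec<u8>;
    fn decode(&self, bytes: &[u8]) -> Option<String>;
}

/// Hypothetical transport interface (also an assumption):
/// moves opaque byte frames, knowing nothing about their encoding.
trait Transport {
    fn send(&mut self, frame: Vec<u8>);
    fn recv(&mut self) -> Option<Vec<u8>>;
}

/// Stand-in codec using plain UTF-8. An Arrow or Parquet codec would
/// implement the same trait without touching any transport code.
struct Utf8Codec;

impl Codec for Utf8Codec {
    fn encode(&self, value: &str) -> Vec<u8> {
        value.as_bytes().to_vec()
    }

    fn decode(&self, bytes: &[u8]) -> Option<String> {
        String::from_utf8(bytes.to_vec()).ok()
    }
}

/// In-memory stand-in for a UNIX socket, TCP, or UDP connector.
#[derive(Default)]
struct InMemoryTransport {
    queue: VecDeque<Vec<u8>>,
}

impl Transport for InMemoryTransport {
    fn send(&mut self, frame: Vec<u8>) {
        self.queue.push_back(frame);
    }

    fn recv(&mut self) -> Option<Vec<u8>> {
        self.queue.pop_front()
    }
}

/// Works with any codec/transport pair: encode, send, receive, decode.
fn round_trip(codec: &dyn Codec, transport: &mut dyn Transport, value: &str) -> Option<String> {
    transport.send(codec.encode(value));
    let frame = transport.recv()?;
    codec.decode(&frame)
}
```

Because `round_trip` only sees the two traits, a new codec would work with every existing transport as soon as it implements `Codec`, which is the "unlock all those transports at once" effect mentioned above.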
Adding this for reference: https://docs.rs/arrow/latest/arrow/index.html