arrow-tools

A collection of handy CLI tools to convert CSV and JSON to Apache Arrow and Parquet

Results: 9 arrow-tools issues

Hi @domoritz As others have mentioned, these tools are really powerful. Thanks for the great work. I'd like to add these to the [scoop](https://scoop.sh) repositories. Scoop is a very convenient...

I tried the new releases but get an error. ``` > csv2arrow data/simple.csv -n Schema: { "fields": [ { "name": "a", "data_type": "Int64", "nullable": true, "dict_id": 0, "dict_is_ordered": false, "metadata":...

bug

Hi there! First of all thank you for the tooling, it's incredibly powerful. I have been using `json2parquet` to process some intricate `.jsonl` files. I have had a good time...

Thanks for making these tools. They are great. It would help non-Rustaceans to have schema examples for nontrivial types: Decimal128, Dictionary, etc.

See https://github.com/domoritz/json2parquet/issues/99 by @cardi

bug

Some basic CI testing would be great to prevent regressions.

enhancement

It's safest to infer the schema on the entire dataset. When the dataset is larger than RAM, this is currently not possible via stdin as the implementation in #10 and...

enhancement

Not sure I am doing this right, but I am trying to convert a CSV containing some timestamp to a parquet file. Sample CSV ``` 072e4a64-2ffb-437c-9458-4953abaa7a20,1,2023-01-18 23:05:10,104,-1,0 072e4a64-2ffb-437c-9458-4953abaa7a20,2,2023-01-18 23:05:10,104,-1,0 072e4a64-2ffb-437c-9458-4953abaa7a20,4,2023-01-18...