parquet-tools
easy install parquet-tools
https://github.com/apache/parquet-format
When piping `parquet-tools csv` output to `less`, I consistently get a BrokenPipeError when `less` is closed: ``` > parquet-tools csv my.parquet | less Traceback (most recent call last): File "/.../bin/parquet-tools",...
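For context on the report above, here is a minimal Python sketch of the general pattern for a command-line tool whose reader (such as `less`) closes the pipe early. It is not the project's code: `write_rows` and `main` are hypothetical names, and the redirect-to-devnull step follows the note on SIGPIPE in the Python `signal` module documentation.

```python
import os
import sys


def write_rows(rows, out=sys.stdout):
    """Write each row followed by a newline.

    Returns False if the reader closed the pipe, True otherwise.
    """
    try:
        for row in rows:
            out.write(row + "\n")
        out.flush()
    except BrokenPipeError:
        return False
    return True


def main():
    # Hypothetical data standing in for CSV lines produced from a parquet file.
    rows = (f"{i},{i * i}" for i in range(5))
    if not write_rows(rows):
        # Point stdout at devnull so the interpreter's final flush
        # does not raise BrokenPipeError a second time at exit.
        devnull = os.open(os.devnull, os.O_WRONLY)
        os.dup2(devnull, sys.stdout.fileno())
        sys.exit(1)


if __name__ == "__main__":
    main()
```

With this shape, quitting the pager early ends the program quietly instead of printing a traceback.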
I am not able to generate the schema for a parquet file. Below is the error. parquet-tools schema Sample.parquet usage: parquet-tools [-h] {show,csv,inspect} ... **parquet-tools: error: argument {show,csv,inspect}: invalid choice:...
Fix #54
It'd be great if you'd support moto 5.x. moto 5 brings some API changes, but seems to be much simpler overall.
If users specify multiple parquet files that have different column structures, the output is concatenated column-wise (not row-wise). ```bash $ poetry run parquet-tools csv ./tests/test1.parquet ./tests/test0.parquet one,two,three,a,b,c,d...
CSV export is nice, but if a text field itself contains the separator, the output cannot be interpreted correctly. 1. Exporting as JSON would alleviate the problem 2. Another...
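To illustrate the separator problem described above, here is a small Python sketch using only the standard library. It is an assumption about how the problem could be handled, not the tool's implementation: `to_csv` and `to_json_lines` are hypothetical helpers, and the sample rows are made up. Quoting fields that contain the separator lets a CSV reader recover them, and JSON lines avoid the ambiguity altogether.

```python
import csv
import io
import json

# Made-up sample rows; the second one contains the separator inside a field.
rows = [
    {"id": 1, "note": "plain"},
    {"id": 2, "note": "has, a comma"},
]


def to_csv(rows):
    """Serialize rows as CSV, quoting only the fields that need it."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=list(rows[0]),
        quoting=csv.QUOTE_MINIMAL,
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_json_lines(rows):
    """Serialize rows as one JSON object per line."""
    return "\n".join(json.dumps(row) for row in rows)


text = to_csv(rows)
# Reading the CSV back recovers the field with the embedded comma intact.
parsed = list(csv.DictReader(io.StringIO(text)))
```

Note that values read back from CSV are strings, whereas the JSON lines keep `id` as a number.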
Would it be possible to have a parameter that behaves like `truncate` in Spark's `show()`? Currently, if you try to show a data frame with a column that has a long string,...
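As a rough sketch of what such a parameter might do, the hypothetical helper below shortens long cell values before display. The name `truncate_cell`, the default width of 20, and the trailing `...` are assumptions loosely modeled on Spark's behavior; none of this is an existing parquet-tools option.

```python
def truncate_cell(value, width=20):
    """Return str(value), shortened to at most `width` characters.

    A width of 0 or less disables truncation. For widths of 4 or more,
    the last three characters of a shortened value are replaced by "...".
    """
    text = str(value)
    if width <= 0 or len(text) <= width:
        return text
    if width < 4:
        return text[:width]
    return text[: width - 3] + "..."


# Hypothetical row with one long string column.
row = {"id": 7, "description": "a very long description that overflows the table"}
shortened = {key: truncate_cell(val, width=12) for key, val in row.items()}
```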
Showing row groups, the compression used, metadata and encoding. e.g. ``` docker run -v $PWD/houseprices:/data markhneedham/pq meta /data/house_prices.parquet File path: /data/house_prices.parquet Created by: parquet-cpp version 1.5.1-SNAPSHOT Properties: (none) Schema: message...
It would be useful to print column sizes with `inspect`.