json2parquet

Convert JSON files to Parquet using PyArrow

Results 16 json2parquet issues

`from json2parquet import load_json, ingest_data, write_parquet, write_parquet_dataset`
`File "directory/file_name.py", line 5, in <module>`
`from json2parquet import load_json, ingest_data, write_parquet, write_parquet_dataset`
`ImportError: cannot import name 'load_json' from 'json2parquet'`
I have installed...

Add a feature to partition output files by one or more columns in the schema. Initially, partitioning is by column value only; expressions are not allowed as partition keys.
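As a rough illustration of what column-only partitioning could look like, the sketch below groups records into Hive-style `column=value` directories. The helper `partition_records` and the sample records are hypothetical and not part of json2parquet. For actual Parquet output, PyArrow's `pyarrow.parquet.write_to_dataset` takes a `partition_cols` argument that produces a similar directory layout.

```python
from collections import defaultdict


def partition_records(records, partition_cols):
    """Group records by the values of partition_cols (hypothetical helper).

    Returns a dict mapping a Hive-style relative directory such as
    "year=2020/country=US" to the list of records that belong there.
    Partition columns are dropped from each record, since their values
    are already encoded in the directory name.
    """
    groups = defaultdict(list)
    for record in records:
        # Build the directory key from plain column values only (no expressions).
        key = "/".join(f"{col}={record[col]}" for col in partition_cols)
        remaining = {k: v for k, v in record.items() if k not in partition_cols}
        groups[key].append(remaining)
    return dict(groups)


# Made-up sample data for illustration.
records = [
    {"year": 2020, "country": "US", "amount": 10},
    {"year": 2020, "country": "DE", "amount": 7},
    {"year": 2021, "country": "US", "amount": 3},
    {"year": 2020, "country": "US", "amount": 5},
]
partitions = partition_records(records, ["year", "country"])
for directory, rows in sorted(partitions.items()):
    print(directory, rows)
```

Each key in `partitions` would correspond to one output directory, with one or more Parquet files written beneath it.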

This will never be perfect in Python, but right now what we do when we load JSON and convert it to a columnar format is pretty disgusting.
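For context, converting row-oriented JSON to a columnar layout means pivoting a list of objects into one list per field. The following is a minimal sketch of that step, assuming newline-delimited JSON and a known list of field names. `rows_to_columns` is a hypothetical helper and not how json2parquet actually does it.

```python
import json


def rows_to_columns(rows, field_names):
    """Pivot a list of JSON objects into a dict of column lists (hypothetical helper)."""
    columns = {name: [] for name in field_names}
    for row in rows:
        for name in field_names:
            # Missing keys become None so every column keeps the same length.
            columns[name].append(row.get(name))
    return columns


# Made-up newline-delimited JSON input.
lines = [
    '{"id": 1, "name": "a"}',
    '{"id": 2}',
    '{"id": 3, "name": "c", "extra": true}',
]
rows = [json.loads(line) for line in lines]
columns = rows_to_columns(rows, ["id", "name"])
print(columns)
```

Fields not listed in `field_names` (here `extra`) are silently dropped. A real implementation would need to decide whether that is acceptable or should raise an error.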

Check that the PyArrow data structures match what the JSON loader produces, and that both match what Redshift has.

Any source we can get a valid schema from should work fine.

I know it's still being worked on in PyArrow, but since this is JSON it should probably be considered.