json2parquet
Issue #4 Improve load json data
A couple of changes:
- Changed the JSON file reading to iterate with for ... in file (faster and cleaner according to the documentation).
- Merged the load-schema function and the function that converts data using column names into a single function.
- I'm not sure how efficient f7 is (the function that concatenates lists while preserving order and removing duplicates), but I couldn't find a better way (taken from SO :| ).
- Please excuse me if my style isn't appropriate for contributing to a project of this kind; I'm a newbie at this sort of stuff.
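For reference, here is a minimal sketch of the two ideas in the list above. It is not the code from this PR. The names load_rows and merge_keys are hypothetical, and it assumes newline-delimited JSON input and Python 3.7+ (where dict preserves insertion order), which may be a simpler alternative to the f7 recipe.

```python
import io
import json


def load_rows(fileobj):
    """Parse one JSON object per non-empty line, iterating the file directly."""
    return [json.loads(line) for line in fileobj if line.strip()]


def merge_keys(*key_lists):
    """Concatenate lists, drop duplicates, keep first-seen order."""
    merged = {}
    for keys in key_lists:
        # Keys already present keep their original position.
        merged.update(dict.fromkeys(keys))
    return list(merged)


# Stand-in for an open file; a real call would pass open(path).
sample = io.StringIO('{"a": 1, "b": 2}\n{"b": 3, "c": 4}\n')
rows = load_rows(sample)
columns = merge_keys(*(row.keys() for row in rows))
```

With the sample input, columns comes out as a, b, c: the order in which each key was first seen.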
Hey, thanks for this. I'll need to digest the code flow a bit more, but it looks OK in general. I don't understand the usage of _col = column_data.get(column, [None]*(row_count-1)), but it may become clearer when I walk through it again.
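One possible reading of that line, offered as a guess rather than a description of the PR's actual code: when a column shows up for the first time in a later row, the default passed to .get() back-fills None for every earlier row so that all column lists stay the same length. The function rows_to_columns below is a hypothetical sketch of that idea, with row_count assumed to be the 1-based index of the current row.

```python
def rows_to_columns(rows):
    """Pivot a list of dicts into {column: [values]}, padding gaps with None."""
    column_data = {}
    for row_count, row in enumerate(rows, start=1):
        for column, value in row.items():
            # A column seen for the first time gets None for each earlier row.
            _col = column_data.get(column, [None] * (row_count - 1))
            _col.append(value)
            column_data[column] = _col
        # Columns missing from this row are padded so all lists stay aligned.
        for _col in column_data.values():
            if len(_col) < row_count:
                _col.append(None)
    return column_data


result = rows_to_columns([{"a": 1}, {"a": 2, "b": "x"}, {"b": "y"}])
```

Here column b first appears in the second row, so it starts with a single None, and column a is padded with None for the third row.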
There are a couple of things that I would rewrite (I added comments in the PR). Can I change them and send a new PR?