json2parquet

Issue #4 Improve load json data

Open sojovi opened this issue 6 years ago • 2 comments

A couple of changes:

  1. Changed the JSON file reading to iterate over the file object with a for loop (faster and cleaner according to the documentation).
  2. Merged the load schema function and the conversion of data by column name into a single function.
  3. I'm not sure about the efficiency of f7, the function that concatenates lists while preserving order and removing duplicates, but I couldn't find a better way (taken from SO :| ).
  4. Please excuse me if my style is not appropriate for contributing to a project of this kind; I'm a newbie at this sort of stuff.

sojovi, Jan 24 '19 23:01
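
For readers following the list above, here is a minimal sketch of two of the ideas it mentions: reading JSON by iterating over the file object, and concatenating lists while preserving order and dropping duplicates. The PR's code is not shown in this thread, so everything here is an assumption: the names `load_rows` and `unique_in_order` are hypothetical, the input is assumed to be newline-delimited JSON, and `dict.fromkeys` is swapped in as an alternative to the f7 snippet rather than a copy of it.

```python
import io
import json
from itertools import chain


def unique_in_order(*lists):
    """Concatenate the given iterables, keeping the first occurrence of each item."""
    # dicts preserve insertion order (Python 3.7+), so fromkeys dedupes in order.
    return list(dict.fromkeys(chain.from_iterable(lists)))


def load_rows(file_obj):
    """Parse one JSON object per non-empty line by iterating over the file."""
    # Iterating over a file object yields one line at a time, without
    # reading the whole file into memory first.
    return [json.loads(line) for line in file_obj if line.strip()]


# Hypothetical sample input standing in for a real file opened with open().
sample = io.StringIO('{"a": 1, "b": 2}\n{"b": 3, "c": 4}\n')
rows = load_rows(sample)
columns = unique_in_order(*(row.keys() for row in rows))
print(columns)  # ['a', 'b', 'c']
```

Both approaches run in time linear in the total number of items, so the choice between this and a set-based helper is mostly a matter of readability.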

Hey, thanks for this. I'll need to digest the code flow a bit more, but it looks OK in general. I don't understand the usage of _col = column_data.get(column, [None]*(row_count-1)), but it may become clearer when I walk through it again.

andrewgross, Feb 19 '19 15:02
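
One plausible reading of the `column_data.get(column, [None]*(row_count-1))` expression quoted above is a backfill: when a column first appears partway through the data, it starts as a list of `None` values for all earlier rows. The sketch below illustrates that pattern only. It is not the PR's code; the function name `rows_to_columns` and the assumption that `row_count` is 1-based and includes the current row are hypothetical.

```python
def rows_to_columns(rows):
    """Turn a list of dict rows into equal-length column lists, padding gaps with None."""
    column_data = {}
    row_count = 0
    for row in rows:
        row_count += 1  # assumed 1-based: counts the row being processed
        for column, value in row.items():
            # A column seen for the first time is backfilled with None for
            # the row_count - 1 rows that came before it.
            _col = column_data.get(column, [None] * (row_count - 1))
            _col.append(value)
            column_data[column] = _col
        # Columns missing from this row get a None so all lengths stay equal.
        for values in column_data.values():
            if len(values) < row_count:
                values.append(None)
    return column_data


result = rows_to_columns([{"a": 1}, {"a": 2, "b": "x"}, {"b": "y"}])
print(result)  # {'a': [1, 2, None], 'b': [None, 'x', 'y']}
```

Under this reading, every column list has exactly `row_count` entries after each row, which is what a columnar writer generally expects.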

There are a couple of things that I would rewrite (I added comments in the PR). Can I change them and send a new PR?

sojovi, Feb 19 '19 16:02