spyql
Support reading and writing of JSON objects (not JSON lines)
Currently SPyQL only supports reading and writing JSON lines. JSON arrays can be written by using `dict_agg` to aggregate everything into an array and writing a JSON with a single line.
The idea is to add an argument `lines=True` to the json and orjson writers and processors. The processor should be able to handle single-object files as well as arrays of objects or arrays of scalars. When `lines` is `False`, the processor should load the full input into memory and then parse it. While this is not ideal, it is the most straightforward implementation. In addition, JSON arrays shouldn't be used for large data; in those cases JSON lines should be used instead.
The writer should write an array of objects when `lines` is `False`.
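The writer side could look roughly like this. Again, `write_json_rows` is a hypothetical name used for illustration, not an existing SPyQL function. This sketch emits the array incrementally, so the output rows do not need to be held in memory.

```python
import io
import json
from typing import Any, Iterable, TextIO


def write_json_rows(rows: Iterable[Any], out: TextIO, lines: bool = True) -> None:
    """Write rows as JSON lines or as one JSON array (hypothetical helper)."""
    if lines:
        # JSON lines: one JSON document per output line.
        for row in rows:
            out.write(json.dumps(row) + "\n")
        return
    # lines=False: a single JSON array, written element by element.
    out.write("[")
    for i, row in enumerate(rows):
        if i:
            out.write(", ")
        out.write(json.dumps(row))
    out.write("]\n")
```

With `lines=False`, the rows `{"a": 1}` and `{"a": 2}` are written as `[{"a": 1}, {"a": 2}]`; an empty result is written as `[]`.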
BTW, reading JSON arrays is done today by leveraging jq on the command line:
jq -c '.[]' myfile.json | spyql "SELECT .aproperty FROM json"