#+TITLE: json2nd
Filter to convert JSON array data, and plain JSON, to NDJSON (newline-delimited JSON).
* Usage
Convert JSON with a top-level array to NDJSON (newline-delimited JSON, aka JSONL):
#+begin_src sh
cat large_array.json | json2nd > object_per_line.json
#+end_src
Like many filter programs, you can switch from reading STDIN to a list of filenames, so it also acts a bit like a JSON ~cat~ that converts arrays:
#+begin_src sh
json2nd large_array1.json large_array2.json > object_per_line.json
#+end_src
If the array you want to unpack is below the surface:
#+begin_src json
{ "stuff": { "things": [1, 2, 3] } }
#+end_src
then you can use the ~-path~ flag to extract it:
#+begin_src sh
json2nd -path stuff.things file.json
#+end_src
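The path lookup can be pictured as a walk over nested objects, one key per dot-separated segment. Here is a hypothetical Go sketch of that idea (the ~lookupPath~ helper and its decoded-map approach are illustrative assumptions; json2nd itself works on the byte stream rather than a fully decoded value, so the file never has to fit in memory):

#+begin_src go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// lookupPath walks a decoded JSON value following a dotted path such as
// "stuff.things". It returns the value found and whether the walk succeeded.
func lookupPath(doc any, path string) (any, bool) {
	cur := doc
	for _, key := range strings.Split(path, ".") {
		obj, ok := cur.(map[string]any)
		if !ok {
			return nil, false // hit a non-object before the path ended
		}
		cur, ok = obj[key]
		if !ok {
			return nil, false // key missing at this level
		}
	}
	return cur, true
}

func main() {
	var doc any
	if err := json.Unmarshal([]byte(`{"stuff": {"things": [1, 2, 3]}}`), &doc); err != nil {
		panic(err)
	}
	v, ok := lookupPath(doc, "stuff.things")
	fmt.Println(v, ok) // prints: [1 2 3] true
}
#+end_src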
For more see [[./doc/other_usage.org][other usage scenarios]], and [[./doc/json_considerations.org][JSON considerations]].
* Installation
** Using Go
Assuming your ~$GOBIN~ directory is somewhere in your ~$PATH~ it's as simple as:
#+begin_src sh
go install github.com/draxil/json2nd@latest
#+end_src
** GitHub releases
There are builds of release points on GitHub. Grab the relevant build from [[https://github.com/draxil/json2nd/releases][the GitHub releases]] page; right now these just contain a binary and docs.
* Plans / what about XYZ?
** Why are your error messages so bad?
Since we moved off a full parser, the first priorities were speed, memory use, and correct behaviour.
Being helpful when things go wrong is pretty much the next goal.
** Windows-style line endings ("\r\n")?
Maybe. Honestly, this would be going beyond the simplicity of this tool, but I can see how it could be useful to some people. Bug me?
** What about ~jq~?
[[https://stedolan.github.io/jq/][jq]] is great! And a lot of the time I'm running this so I can slice a file up and run ~jq~ more quickly! This is simpler to use as it's a single-purpose tool, and should be faster than ~jq~. Also, as a lot of the use case for this is people sending me inadvisably large files, we don't load the whole thing into memory; ~jq~ unavoidably does, and that kills my machine on some of my 4GB+ (I kid you not) examples.
Sometimes you want UNIX philosophy; sometimes you need an atomic chainsaw. Why not have both?
** Why aren't you using a JSON parser?
Originally I did, but they were either slow or insisted on having the entire file in memory, and I want this to cope with very large files. See also [[./doc/json_considerations.org][JSON considerations]].
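For contrast, Go's own ~encoding/json~ can stream an array without holding the whole file in memory; a minimal sketch of that route (illustration only, this is not json2nd's implementation, which scans the raw bytes itself):

#+begin_src go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
	"strings"
)

// arrayToNDJSON streams a top-level JSON array from r to w, one element
// per line, without buffering the whole input.
func arrayToNDJSON(r io.Reader, w io.Writer) error {
	dec := json.NewDecoder(r)
	// Consume the opening '[' token.
	if _, err := dec.Token(); err != nil {
		return err
	}
	for dec.More() {
		// json.RawMessage captures each element's bytes verbatim.
		var elem json.RawMessage
		if err := dec.Decode(&elem); err != nil {
			return err
		}
		if _, err := fmt.Fprintf(w, "%s\n", elem); err != nil {
			return err
		}
	}
	// Consume the closing ']' token.
	_, err := dec.Token()
	return err
}

func main() {
	in := strings.NewReader(`[{"a":1}, {"a":2}, 3]`)
	// Writes each array element on its own line to stdout.
	if err := arrayToNDJSON(in, os.Stdout); err != nil {
		panic(err)
	}
}
#+end_src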
* Author / credits
Joe Higton [email protected]
- [[https://www.reddit.com/user/skeeto/][/u/skeeto/]] for some helpful comments.
* Licence
Please see the [[./LICENSE][licence file]].