Feature: compact output with line folding
Description
I would like to see a feature that folds compact JSON output at N characters while still keeping it valid JSON. Such a feature would let users easily produce compact JSON that can be embedded in documents that restrict line length.
Attempted alternatives
fold
fold produces invalid JSON, since it breaks lines with no regard for JSON structure (note false and null split across lines below):
# lsblk -J | jq -c . | fold -s
{"blockdevices":[{"name":"nbd0","maj:min":"43:0","rm":false,"size":"0B","ro":fal
se,"type":"disk","mountpoints":[null]},{"name":"nbd1","maj:min":"43:32","rm":fal
se,"size":"0B","ro":false,"type":"disk","mountpoints":[null]},{"name":"nbd2","ma
j:min":"43:64","rm":false,"size":"0B","ro":false,"type":"disk","mountpoints":[nu
ll]},{"name":"nbd3","maj:min":"43:96","rm":false,"size":"0B","ro":false,"type":"
disk","mountpoints":[null]},{"name":"zram0","maj:min":"253:0","rm":false,"size":
"15.7G","ro":false,"type":"disk","mountpoints":["[SWAP]"]},{"name":"vda","maj:mi
n":"254:0","rm":false,"size":"362.4M","ro":true,"type":"disk","mountpoints":[nul
l]},{"name":"vdb","maj:min":"254:16","rm":false,"size":"8T","ro":false,"type":"d
isk","mountpoints":[null],"children":[{"name":"vdb1","maj:min":"254:17","rm":fal
se,"size":"460.4G","ro":false,"type":"part","mountpoints":["/etc/hosts","/etc/ho
stname","/etc/resolv.conf"]}]},{"name":"vdc","maj:min":"254:32","rm":false,"size
":"1G","ro":false,"type":"disk","mountpoints":["[SWAP]"]}]}
Regex
Using a regex can sometimes produce valid JSON, but it is unreliable and suboptimal:
# lsblk -J | jq -c . | sed -E 's/(.{1,80}),/\1,\n/g'
{"blockdevices":[{"name":"nbd0","maj:min":"43:0","rm":false,"size":"0B",
"ro":false,"type":"disk","mountpoints":[null]},{"name":"nbd1","maj:min":"43:32",
"rm":false,"size":"0B","ro":false,"type":"disk","mountpoints":[null]},
{"name":"nbd2","maj:min":"43:64","rm":false,"size":"0B","ro":false,"type":"disk",
"mountpoints":[null]},{"name":"nbd3","maj:min":"43:96","rm":false,"size":"0B",
"ro":false,"type":"disk","mountpoints":[null]},{"name":"zram0","maj:min":"253:0",
"rm":false,"size":"15.7G","ro":false,"type":"disk","mountpoints":["[SWAP]"]},
{"name":"vda","maj:min":"254:0","rm":false,"size":"362.4M","ro":true,
"type":"disk","mountpoints":[null]},{"name":"vdb","maj:min":"254:16","rm":false,
"size":"8T","ro":false,"type":"disk","mountpoints":[null],
"children":[{"name":"vdb1","maj:min":"254:17","rm":false,"size":"460.4G",
"ro":false,"type":"part","mountpoints":["/etc/hosts","/etc/hostname",
"/etc/resolv.conf"]}]},{"name":"vdc","maj:min":"254:32","rm":false,"size":"1G",
"ro":false,"type":"disk",
"mountpoints":["[SWAP]"]}]}
I can see this being useful in some situations, but a problem comes to mind:
Given that there is no way to split values between lines in JSON, this is an impossible task when folding a document containing a value with a textual representation exceeding N characters.
Any implementation of such a feature would be forced either to allow exceptions in the output (lines longer than N) or to give up when it encounters a value that won't fit the constraint.
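As an illustration (a hypothetical document, not one of the examples above): with an 80-character limit, a document containing a string value whose compact representation is itself longer than 80 characters cannot be folded to fit, because the only legal places for a newline are between tokens, outside the string:
{"href":
"https://example.com/a/path/long/enough/that/this/string/alone/cannot/fit/inside/an/eighty/character/line"}
The second line exceeds the limit no matter where the other line breaks are placed.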
To anyone wanting such a feature, I suggest seeking an alternative solution, such as encoding the JSON in a format that allows line breaks and is suitable for the document in which the JSON is embedded. Base64 is a fairly universal option. Sure, it's not human-readable, but I'd argue compact JSON isn't really human-readable either, and you shouldn't be relying on it being quasi-human-readable.
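A sketch of that approach, assuming GNU coreutils base64 (-w sets the wrap column, -d decodes; the file name is only an example):
$ lsblk -J | jq -c . | base64 -w 76 > embedded.txt
$ base64 -d embedded.txt | jq .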
Given that there is no way to split values between lines in JSON, this is an impossible task when folding a document containing a value with a textual representation exceeding N characters.
Correct, so that would be a necessary exception. I've already written a tool that does this, and that condition is part of it. I think it would be good to have jq do it natively. I understand if the jq devs reject the feature request though.
Base64 is a fairly universal option. Sure, it's not human-readable, but I'd argue compact JSON isn't really human-readable either, and you shouldn't be relying on it being quasi-human-readable.
I disagree. You can grep for contents in compact JSON, but not in Base64.
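For example (the counts assume the lsblk data shown earlier, where the compact output is a single line and zram0 appears once):
$ lsblk -J | jq -c . | grep -c '"zram0"'
1
$ lsblk -J | jq -c . | base64 -w 76 | grep -c '"zram0"'
0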
How should long strings be displayed?
@itchyny In my script, long lines are just left as long lines. It's impossible to shorten them while keeping the output valid JSON, so validity is favored over compactness.
$ curl -fsSL https://httpbin.io/json | compact-json-fold.py -n 30
{"slideshow":{"author":
"Yours Truly","date":
"date of publication",
"slides":[{"title":
"Wake up to WonderWidgets!",
"type":"all"},{"items":
["Why <em>WonderWidgets</em> are great",
"Who <em>buys</em> WonderWidgets"]
,"title":"Overview","type":
"all"}],"title":
"Sample Slide Show"}}
In many use cases, simply converting to YAML may achieve the desired effect? YAML supports wrapping at N characters.
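For illustration, one way that might look, assuming PyYAML is installed (the commands are my own, not part of the suggestion above):
$ lsblk -J | jq -c . | python3 -c 'import sys, json, yaml; print(yaml.safe_dump(json.load(sys.stdin), width=72), end="")'
Converting back with yaml.safe_load and json.dump yields an equivalent compact document, although safe_dump sorts keys by default, so key order may change.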
Given that there is no way to split values between lines in JSON, this is an impossible task when folding a document containing a value with a textual representation exceeding N characters.
Correct, so that would be a necessary exception. I've already written a tool that does this, and that condition is part of it.
I am interested in this tool, because I do indeed use YAML for manually maintaining JSON data and then simply convert back to JSON for the APIs that require it. Unfortunately, it is not unusual for my data to contain \n or \ to denote 'new line' or 'not wrap' when the data itself is simply text with two carriage returns to denote paragraphs. I'm interested in your unwrap tool because I've not found a good solution for this. awk can fix up a lot, but I've not landed on a definitive expression to reverse the extra characters correctly in all cases.
@georgalis I've moved my script from a private repo into my public utilities repo https://github.com/danielhoherd/pub-bin/blob/main/compact-json-fold.py