Any ability to use relaxed JSON?
I have a bit of a strange use case where I really want jq to be able to take in relaxed json as input and output strict json.
Take a look at the following example with another tool called jj. In that JSON there's a trailing , at the end of the first object. jj takes that JSON and removes the trailing comma, making the JSON valid.
$ echo '[{"id": 1, "name": "Arthur", "age": "21",},{"id": 2, "name": "Richard", "age": "32"}]' | jj -p
[
  {
    "id": 1,
    "name": "Arthur",
    "age": "21"
  },
  {
    "id": 2,
    "name": "Richard",
    "age": "32"
  }
]
$ echo '[{"id": 1, "name": "Arthur", "age": "21",},{"id": 2, "name": "Richard", "age": "32"}]' | jq --color-output
parse error: Expected another key-value pair at line 1, column 42
I would like the ability for jq to be less strict, perhaps using the relaxed json standard on input so that I can pass massive amounts of json through jq and not jj which is much, much slower.
Thanks for any help you can provide.
@davidawad - I won't speculate about the future directions of jq, but I think your main choice for the foreseeable future will be which tool to use amongst those that can transform your quasi-JSON to JSON. The jq FAQ lists a bunch of candidates. Of these, I have only verified that hjson does the job, but we'd be interested to know which one you choose.
What surprises me is your statement that jj is slower than jq. I have done some benchmarking using large JSON documents and jj is usually 10 to 15 times faster than jq when using the latter's regular (non-streaming) parser. Are we talking about the same jj? The one I know is at https://github.com/tidwall/jj
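For the trailing-comma case specifically, the quasi-JSON to JSON conversion discussed above can also be sketched as a small pre-processing step. The following is only an illustration, not something jq or jj provides: `strip_trailing_commas` is a hypothetical helper name, and it assumes the only relaxation in the input is a comma directly before a closing `]` or `}` (no comments, unquoted keys, or single-quoted strings).

```python
import json


def strip_trailing_commas(text: str) -> str:
    """Drop a comma that directly precedes a closing ] or }, leaving string contents alone."""
    out = []
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            # Inside a string literal: copy verbatim and track escapes.
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
            out.append(ch)
            continue
        if ch in "]}":
            # Walk back over whitespace; if a comma sits there, remove it.
            i = len(out) - 1
            while i >= 0 and out[i] in " \t\r\n":
                i -= 1
            if i >= 0 and out[i] == ",":
                del out[i]
        out.append(ch)
    return "".join(out)


# The input from the original report, with a trailing comma in the first object.
relaxed = '[{"id": 1, "name": "Arthur", "age": "21",},{"id": 2, "name": "Richard", "age": "32"}]'
strict = strip_trailing_commas(relaxed)
records = json.loads(strict)  # raises ValueError if the result is still not strict JSON
print(json.dumps(records))
```

To use this as a filter in front of jq, the hard-coded `relaxed` string could be replaced with `sys.stdin.read()`. Note that this sketch holds the whole input in memory and loops character by character, so it is unlikely to be fast on very large files.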
@pkoppstein sorry for replying so late!
Yes, almost every single time, jj was much slower than jq. Like, a 40 seconds vs instant output difference.
For examples of what kinds of files I was looking at, see github.com/davidawad/statedb.
need a forgiving mode
echo '{"trailing": "comma", }' | jq --slightly-forgiving // as opposed to full json5 like madness
@itchyny @wader plausible? Producers of machine-generated content, or anyone trying to avoid larger diffs caused by a frequently changing last line, would be pleased to have this sort of thing :)
If it's ok to include a jq file and performance is not an issue you can give https://github.com/wader/json5.jq a try. I would guess it's not that technically hard to support this with jq but hard to know where to draw the line of features to support.
@wader my perspective on this:
- Short-term: Allowing trailing commas (--allow-trailing-comma) would address most factory-like use cases, where the producer may not know if there are more items to process. This change would also make git diff cleaner when adding new items at the end, as it avoids modifying the last line unnecessarily.
- Long-term: It would be great to include json5.jq in stock jq as an experimental feature, perhaps under a --exp-json5 flag. This could evolve over time, guided by feedback from real-world usage.
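To illustrate the producer-side problem described in the short-term point: without trailing-comma support, a producer that does not know whether more items are coming has to write the separator before each item rather than after it. A minimal sketch, assuming a hypothetical `write_json_array` helper that streams items of unknown count into a strict JSON array:

```python
import io
import json
from typing import Iterable, TextIO


def write_json_array(items: Iterable[object], out: TextIO) -> None:
    """Stream items as a strict JSON array without knowing the item count in advance."""
    out.write("[")
    first = True
    for item in items:
        # The comma goes BEFORE every item except the first,
        # so no trailing comma is ever emitted.
        if not first:
            out.write(",")
        out.write("\n  ")
        out.write(json.dumps(item))
        first = False
    out.write("\n]\n")


buf = io.StringIO()
write_json_array(({"id": n} for n in range(3)), buf)
print(buf.getvalue(), end="")
```

This keeps the output strict, but appending an item still changes the previous last line (it gains a comma), which is the diff noise the comment above refers to.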
JSON/JSON5 is a lost cause. The responsibility of handling comments (blast it, Crockford) is up to each of a million applications.
I'm not opposed to any JSON related project adopting JSON5, especially official jq, and JSON Schema. And OpenAPI. And prominent first and third party per-programming language libraries. Any adoptions there are very much welcome.
What I mean is, the JSON (non-5) specification fundamentally cannot change. Because of the legion of hopelessly non-5 parsers.
VSCode abuses the .JSON file extension for decidedly non-JSON contents, making it difficult to apply automated linting.
YAML has been replacing JSON for configuration files.
However, YAML has many subtle parsing quirks regarding strings and truthy values.
TOML offers a modern successor, seen here and there in configuration files.
If OpenAPI made a recommendation in favor of TOML rather than JSON, then that momentum in the REST service space could pull the rest of tech along with it.
Yet other network services use a variety of more compact binary serialization protocols.
On the whole, JSON and other formats unfriendly to comments need to phase out.