configlet
configlet copied to clipboard
lint: consider the JSON parsing/deserialization design
Main options:
-
std/json
1a. The approach so far: parse into aJsonNode
and work only with that. 1b. Parse into aJsonNode
, then unmarshall into some object usingto
. 1c. Plusstd/jsonutils
-
Araq/packedjson
- keeps everything as a string. Lower memory usage thanstd/json
, and sometimes faster. -
planetis-m/eminim
- deserializes usingstd/streams
directly to anobject
. Doesn't fully support object variants, but maybe that isn't a problem for us. -
status-im/nim-json-serialization
- deserializes usingnim-faststreams
directly to anobject
. Probably the most mature third-party option. Currently has a large dependency tree, includingchronos
andbearssl
. -
treeform/jsony
- deserializes fromstring
directly to anobject
.
(Note that disruptek/jason
is serialization-only).
There are also some more obscure ones that I haven't tried, and don't know anything about:
Some of the above are possibly too lenient or require special handling in some edge cases.
Summary:
Library | Permits a trailing comma? | Permits comment? | Duplicate key handling |
---|---|---|---|
Ruby stdlib json | :x: | :white_check_mark: | Uses last value |
std/json |
:white_check_mark: | :white_check_mark: | Uses last value |
std/json patched |
:x: | :x: | Uses last value |
packedjson |
:white_check_mark: | :white_check_mark: | Uses first value |
eminim |
:white_check_mark: | :white_check_mark: | Uses last value |
json_serialization |
:x: | :white_check_mark: | Uses last value |
jsony |
:white_check_mark: | :x: | Uses last value |
For example:
- There is no correct behaviour for a duplicate key - one library may produce an error, another may silently use the value of the first key, and another may silently use the value of the last key.
-
std/json
permits a trailing comma, and comments with//
and/* */
. This is the main reason that it took a while to tick the boxes for "the file must be valid JSON" in https://github.com/exercism/configlet/issues/249. But we now have own patchedstd/json
with stricter parsing.configlet lint
must exit with a non-zero exit code for a trailing comma because the Ruby library that parses it later produces an error for a trailing comma. - Some libraries may use a default value when the key is missing, which we might want to distinguish from e.g. a value that is the empty string.
- Edge cases around a literal
null
.
See also:
- https://einarwh.wordpress.com/2020/05/08/on-the-complexity-of-json-serialization/
- https://labs.bishopfox.com/tech-blog/an-exploration-of-json-interoperability-vulnerabilities
I'd suggest that jsony
or nim-json-serialization
might be best in the long-term. But maybe it's better to stick with the current approach until we've implemented all the linting rules, and refactor it later.
One advantage of the current approach is that it's more low-level, which might better ensure that we're "checking the JSON file itself" rather than "checking that each value is valid when parsed with library X".