jq
Adding `sort_keys` filter as a flexible alternative to `--sort-keys`
Hi!
In my alternative jq implementation jaq, I would like to implement the functionality of jq's `--sort-keys` command-line option, but without creating a command-line option.
In particular, in the corresponding issue, I thought about introducing a new filter `sort_keys`:
def sort_keys: walk(if . >= {} then reduce (keys[] as $k | { ($k): .[$k] }) as $o ({}; . + $o) end);
This filter would allow for the same functionality as today's `--sort-keys`: where `jq --sort-keys 'f'` is currently used, `jq 'f | sort_keys'` could be used instead. It would also allow sorting only specific objects by keys, whereas `--sort-keys` applies to all output unconditionally.
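As a rough illustration of the intended semantics, here is a model in Python rather than jq (Python dicts also preserve insertion order). It sorts keys recursively and contrasts sorting the whole output with sorting only one sub-object. The name `sort_keys_recursive` and the sample document are made up for this example; this is not jq's or jaq's implementation.

```python
import json


def sort_keys_recursive(value):
    """Return a copy of a JSON-like value with every object's keys sorted."""
    if isinstance(value, dict):
        # Rebuild the dict in sorted key order, recursing into the values.
        return {k: sort_keys_recursive(value[k]) for k in sorted(value)}
    if isinstance(value, list):
        return [sort_keys_recursive(v) for v in value]
    return value  # scalars pass through unchanged


# Hypothetical sample input.
doc = {"b": {"y": 1, "x": 2}, "a": {"d": 3, "c": 4}}

# Roughly what `f | sort_keys` (or `--sort-keys`) would do: sort everything.
whole = sort_keys_recursive(doc)

# Roughly what sorting only the object under "a" would do.
partial = dict(doc)
partial["a"] = sort_keys_recursive(doc["a"])

print(json.dumps(whole))    # {"a": {"c": 4, "d": 3}, "b": {"x": 2, "y": 1}}
print(json.dumps(partial))  # {"b": {"y": 1, "x": 2}, "a": {"c": 4, "d": 3}}
```

The second result is the case a global command-line flag cannot express: the top-level order and the object under `"b"` are left alone.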
To keep compatibility between jaq and jq, I would like to synchronise my actions with you. So I'd like to ask the wider jq community: What do you think about this? How would you feel about including `sort_keys` in a future jq release?
If your response is positive, I would be happy to make a PR that adds `sort_keys` to `builtin.jq`.
I think it could be a nice addition. By "allow for sorting only specific objects by keys", do you mean being able to recursively sort part of an object, e.g. `.a |= sort_keys`?
Some thoughts and questions:
- Could `sort_by_keys` possibly be an alternative name?
- `. => {}` is a fancy `type == "object"`?
- Is there some performance reason to split the input object into small one-key objects and then merge them, instead of doing something like `def sort_keys: walk(if . >= {} then . as $o | reduce keys[] as $k ({}; .[$k] = $o[$k]) end);`?
> Think it could be a nice addition. By "allow for sorting only specific objects by keys" you mean to be able to recursively sort part of an object, e.g. `.a |= sort_keys`?

I like this too.

> Some thoughts and questions:
> * Could possibly `sort_by_keys` be an alternative name?

Since the existing command-line option is already `--sort-keys`, naming the new builtin something close to that seems best, IMO.
> * `. => {}` is fancy `type == "object"`?

Huh. In jq the greater-than-or-equal operator is `>=`, not `=>`. Comparisons of values of different types return the type of one minus the type of the other, with types expressed numerically. The only input value that would cause `. >= {}` to be true is `{}`.
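This is worth double-checking against the jq manual, which documents the sort order as null, false, true, numbers, strings, arrays, objects, with objects compared first by their sorted key sets and then value by value. Below is a minimal Python model of that ordering, assuming it matches jq's actual behaviour (the helper names `type_rank`, `jq_cmp` and `ge_empty_object` are hypothetical). It suggests that `. >= {}` holds for every object, not only for `{}`, and for no other type:

```python
def type_rank(v):
    """Numeric rank of a JSON-like value under jq's documented type order."""
    if v is None:
        return 0
    if v is False:
        return 1
    if v is True:
        return 2
    if isinstance(v, (int, float)):
        return 3
    if isinstance(v, str):
        return 4
    if isinstance(v, list):
        return 5
    if isinstance(v, dict):
        return 6
    raise TypeError(f"not a JSON-like value: {v!r}")


def jq_cmp(a, b):
    """Three-way comparison: negative, zero or positive, like a C comparator."""
    ra, rb = type_rank(a), type_rank(b)
    if ra != rb:
        return ra - rb              # different types: compare by type rank
    if ra <= 2:
        return 0                    # null, false, true each equal themselves
    if ra in (3, 4):
        return (a > b) - (a < b)    # numbers and strings
    if ra == 5:                     # arrays: element by element, then length
        for x, y in zip(a, b):
            c = jq_cmp(x, y)
            if c:
                return c
        return len(a) - len(b)
    ka, kb = sorted(a), sorted(b)   # objects: sorted key sets first ...
    c = jq_cmp(ka, kb)
    if c:
        return c
    for k in ka:                    # ... then values, key by key
        c = jq_cmp(a[k], b[k])
        if c:
            return c
    return 0


def ge_empty_object(v):
    """Model of the jq test `. >= {}`."""
    return jq_cmp(v, {}) >= 0


print(ge_empty_object({}))                      # True
print(ge_empty_object({"a": 1}))                # True
print(ge_empty_object([{"a": 1}]))              # False
```

Under this model the empty object is the smallest value of the largest type, which is why the comparison behaves like a type test.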
> * Is there some performance reason to split the input object into small one-key-objects and then merge them instead of doing something like `def sort_keys: walk(if . >= {} then . as $o | reduce keys[] as $k ({}; .[$k] = $o[$k]) end);`?

I would definitely implement this as a C-coded builtin in jq to optimize this. And probably just rewrite each object's insertion order by rewriting all the `next` fields of all the buckets to match sorted key order.
I should add that I wish `keys` had been a special function (like `empty`) that streams the object's keys or array's indices rather than outputting an array of keys. We should probably add a `streamkeys` or `keyss` or some such built-in that does just that. EDIT: Or maybe special syntax for this, like `.[!]` (since Bash uses `!` in `${!var[@]}` to refer to keys instead of values).
Ah, so here `. => {}` (or more likely, `. >= {}`) really means `. != {}`, and assumes `.` is an object.
Yes, sorry for my shitty typing, I meant `. >= {}`.
`sort` does not sort recursively, `map` does not map recursively, and neither should `sort_keys` operate recursively.
Furthermore, a non-recursive `sort_keys` (*) is perfectly useful in itself, and the recursive version of `sort_keys` can be easily enough implemented using the non-recursive version, so adding the non-recursive version should be more than sufficient.
(*) This def has the semantics I have in mind:
def sort_keys: to_entries | sort | from_entries;
> Since the existing command-line option is already `--sort-keys`, naming the new builtin something close to that seems best, IMO.

Mm, agreed, that makes sense. Also, if we ever want a `_by` variant, `sort_keys_by(f)` makes more sense than `sort_by_keys_by` 😬

> I would definitely implement this as a C-coded builtin in jq to optimize this. And probably just rewrite each object's insertion order by re-writing all the `next` fields of all the buckets to match sorted key order.

Also makes sense 👍
> `sort` does not sort recursively, `map` does not map recursively, and neither should `sort_keys` operate recursively.

That's a good point. But wouldn't it be confusing if `sort_keys` were non-recursive but `--sort-keys` were recursive? Hmm
> > `sort` does not sort recursively, `map` does not map recursively, and neither should `sort_keys` operate recursively.
>
> That's a good point. But would be confusing that `sort_keys` would be non-recursive but `--sort-keys` would? Hmm

Yes, I think so, but a non-recursive version is needed too.
> a non-recursive version is needed too.

Here's a thought: define `sort_keys` non-recursively but in a way that makes it trivial to use recursively, e.g. by `walk(sort_keys)`.
An appropriate def would be:
def sort_keys:
if type == "object" then to_entries | sort | from_entries else . end;
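To see how a non-recursive filter composes with `walk`, here is a small Python model (not jq code). `sort_keys` mirrors the def above by passing non-objects through unchanged, and `walk` mimics the bottom-up behaviour of jq's `walk(f)`; both are simplified stand-ins written for this example, and the sample document is invented.

```python
import json


def sort_keys(value):
    """Non-recursive: reorder only this object's own keys; pass others through."""
    if isinstance(value, dict):
        return {k: value[k] for k in sorted(value)}
    return value


def walk(value, f):
    """Apply f to every node, children first, like jq's walk(f)."""
    if isinstance(value, dict):
        value = {k: walk(v, f) for k, v in value.items()}
    elif isinstance(value, list):
        value = [walk(v, f) for v in value]
    return f(value)


# Hypothetical sample input.
doc = {"b": {"d": 1, "c": 2}, "a": [{"z": 0, "y": 1}]}

shallow = sort_keys(doc)      # only the top-level keys are reordered
deep = walk(doc, sort_keys)   # every object at every depth is reordered

print(json.dumps(shallow))  # {"a": [{"z": 0, "y": 1}], "b": {"d": 1, "c": 2}}
print(json.dumps(deep))     # {"a": [{"y": 1, "z": 0}], "b": {"c": 2, "d": 1}}
```

Because the shallow version ignores non-objects, it can be handed to `walk` without any extra type check at the call site.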
Regarding the tension between having the command-line option `--sort-keys` be recursive but the builtin `sort_keys` be non-recursive: if this is indeed going to be a significant obstacle, then how about deprecating the long form `--sort-keys` in favor of an alternative long-form name for the `-S` option?