
Contrary to the manual, `.a.b.c` is not quite the same thing as `.a | .b | .c`

Open nicowilliams opened this issue 1 year ago • 5 comments

The manual says:

Note that .a.b.c is the same as .a | .b | .c.

It seems that way, but this is not true when it comes to assignments:

: ; jq -cn '.a.b.c = 1'
{"a":{"b":{"c":1}}}
: ; jq -cn '.a | .b | .c = 1'
{"c":1}

The difference lies in assignment operators having higher precedence than |, so .a.b.c = 1 is the same as (.a | .b | .c) = 1, not .a | .b | .c = 1.
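The precedence point can be checked directly. This is a minimal sketch, assuming a `jq` binary (1.5 or later) is on the PATH; the shell variable names are only for illustration.

```shell
# Each command starts from null input (-n) and prints compact JSON (-c).
dotted=$(jq -cn '.a.b.c = 1')             # assignment applies to the whole path
grouped=$(jq -cn '(.a | .b | .c) = 1')    # parentheses make the pipeline the LHS
piped=$(jq -cn '.a | .b | .c = 1')        # assignment binds tighter than |
explicit=$(jq -cn '.a | .b | (.c = 1)')   # how jq actually groups the line above

echo "$dotted"    # {"a":{"b":{"c":1}}}
echo "$grouped"   # {"a":{"b":{"c":1}}}
echo "$piped"     # {"c":1}
echo "$explicit"  # {"c":1}
```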

This difference is important to defining the concept of "path expressions" -- a task that has proved a bit slippery.
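One way to see what "path expression" means here is jq's `path(...)` builtin, which reports the path a filter would address. A small sketch, again assuming `jq` is installed; both spellings should yield the same path when the pipeline is taken as a whole.

```shell
# path(f) outputs the path that f addresses, as an array of keys.
p_dotted=$(jq -cn 'path(.a.b.c)')
p_piped=$(jq -cn 'path(.a | .b | .c)')

echo "$p_dotted"  # ["a","b","c"]
echo "$p_piped"   # ["a","b","c"]
```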

Also, it's important to explaining why $a[0] = 1 is an error (namely: $a is a part of the left-hand side of the assignment, but $a isn't a valid path expression). It's almost certainly surprising that $a[0] = 1 is an error while $a | .[0] = 1 is not. It was surprising to me 10 years ago, and while I've long understood exactly why that is an error, it's not clear to me that we shouldn't make it work by having the jq compiler do the user the courtesy of transforming $a[0] = 1 to $a | .[0] = 1.
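The two cases can be compared as follows. This is a sketch assuming `jq` 1.5 or later; the value `[5]` bound to `$a` via `--argjson` is arbitrary, and only the exit status of the failing case is checked because the exact error text may vary between versions.

```shell
# $a[0] = 1 puts $a on the left-hand side, where it is not a valid path expression.
if jq -cn --argjson a '[5]' '$a[0] = 1' >/dev/null 2>&1; then
  direct=ok
else
  direct=error
fi

# $a | .[0] = 1 groups as $a | (.[0] = 1), so the left-hand side is just .[0].
piped=$(jq -cn --argjson a '[5]' '$a | .[0] = 1')

echo "$direct"  # error
echo "$piped"   # [1]
```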

One could even imagine having the jq compiler transform ($a[0] = 1) | ... to ($a | .[0] = 1) as $a | ....
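Written out by hand, the second form already works today. A minimal sketch, assuming `jq` is installed; the input `[5,6]` is arbitrary, and the trailing `$a` stands in for the elided rest of the pipeline.

```shell
# Rebind $a to the updated value; later references see the new binding.
rebound=$(jq -cn --argjson a '[5,6]' '($a | .[0] = 1) as $a | $a')

echo "$rebound"  # [1,6]
```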

Such transformations would make the jq language feel a lot more like a traditional procedural language while not actually changing jq into being a procedural language. This sort of evolution has been a bit of a common theme for functional languages.

I'm not suggesting that we should make these changes now, but we should probably change the manual to say:

Note that .a.b.c is the same as (.a | .b | .c).

nicowilliams avatar Aug 19 '23 23:08 nicowilliams

That sentence was evidently not intended to be taken out of context, the context being the construction of filters to form a pipeline, not the construction of LHS expressions.

Since the sentence does occur in a paragraph by itself, it might help avoid confusion by making the context explicit, e.g. along the lines:

The filter .a | .b can be written as .a.b, and similarly the "pipe" character can be omitted in expressions such as ."a" | ."b" and .["a"] | .[].
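For ordinary (non-assignment) filters, the equivalences in that wording can be checked like this. A sketch assuming `jq` is installed; the sample document is made up for illustration.

```shell
doc='{"a":{"b":[1,2]}}'   # hypothetical input

r1=$(printf '%s' "$doc" | jq -c '.a | .b')
r2=$(printf '%s' "$doc" | jq -c '.a.b')
r3=$(printf '%s' "$doc" | jq -c '."a" | ."b"')
r4=$(printf '%s' "$doc" | jq -c '."a"."b"')
r5=$(printf '%s' "$doc" | jq -c '.["a"] | .[]')
r6=$(printf '%s' "$doc" | jq -c '.["a"][]')

# All six print the same value.
echo "$r1" "$r2" "$r3" "$r4" "$r5" "$r6"  # [1,2] [1,2] [1,2] [1,2] [1,2] [1,2]
```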

pkoppstein avatar Aug 20 '23 01:08 pkoppstein

> That sentence was evidently not intended to be taken out of context, the context being the construction of filters to form a pipeline, not the construction of LHS expressions.

Yes, but that's probably the best place to add parentheses. Users might not notice this difference otherwise.

> Since the sentence does occur in a paragraph by itself, it might help avoid confusion by making the context explicit, e.g. along the lines:
>
> The filter .a | .b can be written as .a.b, and similarly the "pipe" character can be omitted in expressions such as ."a" | ."b" and .["a"] | .[].

This construction has the same problem if what follows .a | .b were an assignment.
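Concretely, the two-level case diverges in the same way once an assignment follows. A minimal sketch, assuming `jq` is installed:

```shell
# With an assignment appended, the piped and dotted spellings differ.
piped=$(jq -cn '.a | .b = 1')    # groups as .a | (.b = 1)
dotted=$(jq -cn '.a.b = 1')      # the whole path .a.b is the LHS

echo "$piped"   # {"b":1}
echo "$dotted"  # {"a":{"b":1}}
```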

I propose merely adding the missing parentheses. Those parentheses would be a clue to users that they should eventually understand jq's precedence rules, and it's the smallest possible edit to that paragraph that corrects the problem.

nicowilliams avatar Aug 20 '23 03:08 nicowilliams

@nicowilliams wrote:

> This construction has the same problem if what follows .a | .b were an assignment.

In the context in which it appears, it seems to me that the term "filter" is referring to filters in the ordinary sense, not the LHS of assignment expressions. Anyway, if you think further clarification is needed that we mean here to exclude the LHS of such expressions, then by all means have at it.

> I propose merely adding the missing parenthesis.

In some other context that would be fine, but here (in the context of what I'll call ordinary filters), it's a bit misleading as it suggests parens MUST be provided, or at least it doesn't make explicit that they're NOT required.

In summary, it seems to me that this section is (and ought for the most part to remain) focused on ordinary filters, which is fine, as the topic of assignment expressions is dealt with elsewhere. Perhaps the point about parens and pipes would fit in more easily there, but it could obviously be done here as well.

pkoppstein avatar Aug 20 '23 04:08 pkoppstein

> the term "filter" is referring to filters in the ordinary sense

sed is a "filter" in the ordinary sense, but it can alter what it filters. Same thing with jq assignment expressions: they are filters.

> it suggests parens MUST be provided

I don't think so, but also no harm comes from users adding more parentheses than strictly necessary.

My argument for not making this change would be that it is too early in the manual to hint at this issue, and that it might be confusing. But one could always wordsmith it like this:

Note that .a.b.c is the same as .a | .b | .c (or more exactly, the same as (.a | .b | .c), but mostly there is no need for those parentheses).

which has the same effect but also tells the user to never mind the reason just yet.

nicowilliams avatar Aug 20 '23 15:08 nicowilliams

We say 3 * 3 is the same as 3 + 3 + 3, but in some contexts we need to add parentheses: for example, substituting into 3 * 3 / 3 requires writing (3 + 3 + 3) / 3.
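The arithmetic analogy can be run in jq itself. A small sketch, assuming `jq` is installed:

```shell
# Substituting 3 + 3 + 3 for 3 * 3 changes the result unless it is parenthesized.
product=$(jq -n '3 * 3 / 3')
naive=$(jq -n '3 + 3 + 3 / 3')       # division binds tighter than +
grouped=$(jq -n '(3 + 3 + 3) / 3')

echo "$product"  # 3
echo "$naive"    # 7
echo "$grouped"  # 3
```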

I think it's more natural to add a statement in the "Assignments" doc section about the precedence of pipe | and assignment =. .a | .b | .c = 1 is the same as .a | .b | (.c = 1). To get the effect of .a.b.c = 1 using pipes, you need to write (.a | .b | .c) = 1. The "Complex assignments" section already uses such parentheses:

(.posts[] | select(.author == "stedolan") | .comments) |= . + ["terrible."]
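To illustrate why those parentheses matter, here is a sketch assuming `jq` is installed; the two-post input document is invented for the example and is not taken from the manual.

```shell
doc='{"posts":[{"author":"stedolan","comments":[]},{"author":"other","comments":["ok"]}]}'

# Parenthesized: the whole pipeline is the LHS, so the full document is returned, updated in place.
with_parens=$(printf '%s' "$doc" |
  jq -c '(.posts[] | select(.author == "stedolan") | .comments) |= . + ["terrible."]')

# Unparenthesized: only .comments is the LHS, so just the selected post is emitted.
without_parens=$(printf '%s' "$doc" |
  jq -c '.posts[] | select(.author == "stedolan") | .comments |= . + ["terrible."]')

echo "$with_parens"
# {"posts":[{"author":"stedolan","comments":["terrible."]},{"author":"other","comments":["ok"]}]}
echo "$without_parens"
# {"author":"stedolan","comments":["terrible."]}
```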

allxiao avatar Dec 15 '23 06:12 allxiao