
Feature request: Add a new, more sensible alternation operator

Open olorin37 opened this issue 4 years ago • 66 comments

jq should distinguish between "falsy" and "nully" values. The alternative operator should replace only nully values, not falsy ones!

empty // null // false // 5
  # is ==> 5
  # should be ==> false

The current behavior of // makes it useless when the data also contains booleans, because every false is replaced by the alternative value. If I needed that behavior, I could simply use if . then "yes" else "no" end; now, instead of the alternative operator, I have to write if . != null then . else "default" end.
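A minimal sketch of the difference described above, assuming a made-up input document with an `enabled` flag (the field name is hypothetical):

```sh
# Input where the flag is explicitly false (hypothetical sample data).
input='{"enabled": false}'

# Current behaviour: // treats false like null, so the default wins.
with_alt=$(printf '%s' "$input" | jq -c '.enabled // true')

# Explicit null check: only a missing or null value gets the default.
with_if=$(printf '%s' "$input" | jq -c 'if .enabled != null then .enabled else true end')

echo "$with_alt"   # true
echo "$with_if"    # false
```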

olorin37 avatar Jan 08 '20 07:01 olorin37

Unfortunately, this is a defined behavior of the operator in the manual (https://stedolan.github.io/jq/manual/#Alternativeoperator%3A%2F%2F). I agree there's value to being able to differentiate between false and null with //, but there's almost certainly code in the wild now that depends on // matching false.

We could consider a variant of that operator that ignores false (maybe !//), but I'm not sure... @nicowilliams what say you?

wtlangford avatar Jan 08 '20 14:01 wtlangford

why not ?? then?

leonid-s-usov avatar Jan 08 '20 16:01 leonid-s-usov

I can kind of relate to the OP's problem, as even the manual gives a misleading example:

This is useful for providing defaults: .foo // 1 will evaluate to 1 if there’s no .foo element in the input.

This example doesn't clarify that in the case of "foo": false, it will still evaluate to 1.

leonid-s-usov avatar Jan 08 '20 16:01 leonid-s-usov

I agree, surely someone has code which relies on that behavior, but in my opinion the actual behavior is even contrary to the manual's description, "This is useful for providing defaults" (in the second paragraph). The first paragraph, however, specifies it clearly:

A filter of the form a // b produces the same results as a, if a produces results other than false and null. Otherwise, a // b produces the same results as b.

So changing it would require changing that specification. The best option is to introduce a new operator. ?? looks nice and sensible. Alternatives could be ?: or !!.

(PS. Pity that // is lost, it would correspond with perl6/raku operator).

olorin37 avatar Jan 08 '20 22:01 olorin37

Maybe /// or ??, yes, to provide an alternation for empty left-hand sides.

nicowilliams avatar Jan 09 '20 17:01 nicowilliams

As discussed many other times, the behaviour of // will not change. We could consider adding an alternative operator that works as you would expect, though there are many possible interpretations for that operator, e.g.:

# lhs newop1 rhs
#  Run rhs for each non-truthy element returned by lhs instead of
#   returning that non-truthy value
def _newop1($lhs; rhs): if $lhs then $lhs else rhs end;

# lhs newop2 rhs
#  Run rhs for each null element returned by lhs instead of returning
#   that null
def _newop2($lhs; rhs): if $lhs != null then $lhs else rhs end;

# lhs newop3 rhs
#  Run rhs only if lhs returns nothing
def _newop3(lhs; rhs):
  . as $dot |
  foreach ((lhs | [.]), null) as $l (false;
    if $l then true end;
    if $l then $l[] else select(not) | $dot | rhs end);

~~In any case, this issue is about changing the behaviour of // which will not happen... so closing.~~

cc: @nicowilliams


n.b.: the behaviour of newop1 is not the current behaviour of lhs // rhs. lhs // rhs returns only the elements of lhs that are truthy, and runs rhs if lhs does not return any truthy values; it is effectively equivalent to the following function:

def _definedor(lhs; rhs):
  . as $dot |
  foreach ((lhs | select(.) | [.]), null) as $l (false;
    if $l then true end;
    if $l then $l[] else select(not) | $dot | rhs end);
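As a rough illustration of that difference, the sketch below runs `_newop1` (copied from the earlier comment) and the built-in `//` on the same made-up left-hand side; only the sample values are assumptions:

```sh
# _newop1 replaces each falsy output individually.
per_element=$(jq -nc 'def _newop1($lhs; rhs): if $lhs then $lhs else rhs end;
  [_newop1(1, false, 2; 9)]')

# // drops the falsy outputs and runs the right-hand side
# only if no truthy output remains.
builtin_some=$(jq -nc '[(1, false, 2) // 9]')
builtin_none=$(jq -nc '[(false, null) // 9]')

echo "$per_element"   # [1,9,2]
echo "$builtin_some"  # [1,2]
echo "$builtin_none"  # [9]
```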

emanuele6 avatar Sep 28 '23 02:09 emanuele6

Let's reuse this issue for the feature request, I guess.


  1. Do we really want this new operator?
  2. What behaviour should it have exactly?
  3. What symbol/syntax should it use?
  4. Should it be a builtin instead?

My preference would be the 3rd behaviour I mentioned in the previous comment (lhs op rhs that runs rhs only if lhs returns nothing), but the 2nd behaviour (run rhs for each element returned by lhs that is null) more closely resembles what was requested by the OP, and could also be nice for simple jq command lines since we don't yet have the proposed .foo!? syntax (effectively shorthand for select(has("foo")).foo?).

I don't have a suggestion for the symbol to use for this new operator at the moment.

It would be more convenient if it were an operator, and not a builtin.

emanuele6 avatar Sep 28 '23 02:09 emanuele6

  1. Maybe. I've wanted it at times.
  2. There's several options. The one I've wanted is "all the values of the LHS or, if the LHS is empty, all the values of the RHS".
  3. ///?
  4. That would work for me.

We could jq-code this function with foreach, however until we fix the reference taking issues in foreach it might be best if we bytecode it in a way very similar to //, only w/o the equivalent of select((.!=null) and (.!=false)). We could call it ifempty/2 or alt/2 or some such name.

nicowilliams avatar Sep 28 '23 02:09 nicowilliams

I think that adding /// will add more confusion to the existing //. I'd prefer a ? based syntax:

  • ?: <- my favorite
  • ??

leonid-s-usov avatar Sep 28 '23 11:09 leonid-s-usov

  1. An operator for providing a default value when the selected one is missing is helpful. // is often used for that purpose, which actually leads to errors, and in my opinion it should not be used.
  2. lhs ?? rhs should return rhs only if lhs == null. (As an aside: I understand that empty is not an actual value but a way to indicate the lack of an element, so I am not sure whether it is possible or feasible to cover the empty case as well. That requires deeper analysis, and might be a mistake.)
  3. ?? or, possibly a better choice, ?:, as this is widely known as the Elvis operator. (It originates from C/C++, where GNU compilers allow the second operand of the ternary operator to be omitted, so x ? : y means the same as x ? x : y. Many languages also implement this syntax as a separate operator ?:.) So ?: would be easily recognizable, which is the benefit. Still, ?? also has its benefits: it is easier to type :-) since it is a doubled character.
  4. I am not sure what the alternative would be. My intention was for the jq language to have such an operator (which could then be recommended instead of //).

olorin37 avatar Sep 28 '23 12:09 olorin37

I think ?? fits well appearance-wise with the existing //. Other options: ?= or || (😬)

wader avatar Sep 28 '23 15:09 wader

As @olorin37 mentioned, ?: has already been used for exactly this purpose in other languages, which IMO is a serious argument for using that syntax. I like ?? though, too.

leonid-s-usov avatar Sep 28 '23 15:09 leonid-s-usov

One caveat about ?? is that ?? is valid in jq today as a postfix operator, but ?? would now be an infix operator, thus breaking any code that was using ?? -- not a very likely case, but hey.

When @wtlangford added destructuring he accidentally changed the meaning of ?// and the workaround is to use ? //. I noticed this recently and was surprised, then amused, but I suspect no one else has noticed (I don't recall seeing an issue about this).

nicowilliams avatar Sep 28 '23 16:09 nicowilliams

  1. An operator for providing a default value when the selected one is missing is helpful. // is often used for that purpose, which actually leads to errors, and in my opinion it should not be used.

    1. lhs ?? rhs should return rhs only if lhs == null. (As an aside: I understand that empty is not an actual value but a way to indicate the lack of an element, so I am not sure whether it is possible or feasible to cover the empty case as well. That requires deeper analysis, and might be a mistake.)

Why is null special? If you're looking for "some key in some object is not set" then .foo? will produce empty and you'd want something like .foo? NEW_OPERATOR_HERE $alternative. If that operator is ?? then we'd have... .foo? ?? $alternative -- a bit strange looking.

3. `??` or, possibly a better choice, `?:`, as this is widely known as the Elvis operator. (It originates from C/C++, where GNU compilers allow the second operand of the ternary operator to be omitted, so `x ? : y` means the same as `x ? x : y`. Many languages also implement this syntax as a separate operator `?:`.) So `?:` would be easily recognizable, which is the benefit. Still, `??` also has its benefits: it is easier to type :-) since it is a doubled character.

But the conditional in the ternary operator would be a conditional, so once again we're looking at false vs. truthy, and what about empty?

Also, we can't introduce a traditional ternary operator because we're already using ? and : in a way that makes that impossible without breaking backwards compatibility. Also, any syntax that involves : is likely to cause ambiguities with : used in object syntax.

Also, jq syntax is so pithy already that using a builtin function instead of an operator for the ternary operator would be better, IMO. One can def cond(c; t; e): if c then t else e end; right now, and it's not that much more verbose than the traditional ternary operator.
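A short usage sketch of the `cond/3` helper defined in the paragraph above; the sample numbers and result strings are made up for illustration:

```sh
# cond(c; t; e) is the definition quoted above; the inputs are hypothetical.
signs=$(jq -nc 'def cond(c; t; e): if c then t else e end;
  [(-1, 0, 2) | cond(. > 0; "positive"; "not positive")]')

echo "$signs"   # ["not positive","not positive","positive"]
```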

nicowilliams avatar Sep 28 '23 16:09 nicowilliams

I think that adding /// will add more confusion to the existing //. I'd prefer a ? based syntax:

* `?:` <- my favorite

Works for me. But now, what should its semantics be?

My preference is that the semantics be that the RHS is run IFF the LHS produces zero outputs (i.e., it's empty).

Anyone who wants to have the alternation be about specific values can use if, and anyone who wants the alternation to be a bit more like // but for specific values can just (lhs | select(condition)) ?: rhs.

nicowilliams avatar Sep 28 '23 16:09 nicowilliams

I think ?? fits well appearance-wise with the existing //. Other options: ?= or || (😬)

= cannot be used, because = is used to create assignment operators like |=.

olorin37 avatar Sep 28 '23 16:09 olorin37

One caveat about ?? is that ?? is valid in jq today as a postfix operator, but ?? would now be an infix operator, thus breaking any code that was using ?? -- not a very likely case, but hey.

So, it means ?: wins, and it should be used.

olorin37 avatar Sep 28 '23 16:09 olorin37

@nicowilliams ?= is NOT an option. foo?=bar is valid jq.

@olorin37 The operator we are proposing to add does not do what you want. I have listed a number of possible behaviours, and you have not specified exactly which behaviour you want. With the third behaviour, null newop3 1 returns null, not 1 (you can test this with the jq implementation I provided: _newop3(null; 1)). It only returns 1 if lhs returns nothing (e.g. empty). Are you OK with that?

I don't like that only syntaxes with ? are being proposed. I would prefer ///, or something else without ?.

emanuele6 avatar Sep 28 '23 17:09 emanuele6

@nicowilliams ?= is NOT an option. foo?=bar is valid jq.

I agree, but I think you meant to address that to @wader :)

nicowilliams avatar Sep 28 '23 18:09 nicowilliams

I don't like that only syntaxes with ? are being proposed. I would prefer ///, or something else without ?.

I'm happy with ///, and I'm happy with ?:, and I think the first is probably better than the second.

nicowilliams avatar Sep 28 '23 18:09 nicowilliams

@olorin37 The operator we are proposing to add, does not do what you want.

Right, I noticed; this is why I did not select any of them, but described my own understanding instead.

I was not aware of, or had forgotten about, the suffix operator .x?. I want this behaviour built into jq: given maybe_nothing, which may be null or empty, a default_value, and our operator op, the expression maybe_nothing op default_value should equal default_value.

And I can agree with maybe_nothing? /// default_value for syntax clarity. My concern was that it seems a little too verbose, so I wanted the question mark to be optional here, but that could be a wrong idea with regard to keeping the syntax consistent.

As there are some objective technical arguments against all operators with ? or :, let's not use them; here is another proposal. The operator could be textual: orelse, for example? There are already some textual operators, so maybe it would be possible? (It comes from Erlang.)

olorin37 avatar Sep 28 '23 18:09 olorin37

RE: @olorin37

  1. lhs ?? rhs should return rhs only if lhs == null. (As an aside: I understand that empty is not an actual value but a way to indicate the lack of an element, so I am not sure whether it is possible or feasible to cover the empty case as well. That requires deeper analysis, and might be a mistake.)

empty is not really a value; it is a function that returns no values. Expressions in jq are generators that can return multiple values: e.g. 1 returns 1; 1,2 returns 1 and then 2; range(4) returns 0,1,2,3; etc. Operators may then either implicitly loop over the results of the generator, or use the generator as a "lambda" and implement something more complex with it.
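To make the generator behaviour concrete, here is a small sketch that collects each expression's outputs into an array; the last line is an added illustration of a filter running once per output:

```sh
none=$(jq -nc '[empty]')       # empty produces no outputs at all
two=$(jq -nc '[1, 2]')         # the comma operator produces 1, then 2
four=$(jq -nc '[range(4)]')    # range(4) produces 0, 1, 2, 3

# A filter piped after a generator runs once per output.
doubled=$(jq -nc '[range(4) | . * 2]')

echo "$none"     # []
echo "$two"      # [1,2]
echo "$four"     # [0,1,2,3]
echo "$doubled"  # [0,2,4,6]
```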

With the second operator I proposed _newop2(1,false,null,3,null ; 4,5) will return:

1
false
4
5
3
4
5

I.e. loop the values returned by lhs, return the value if it is not null, otherwise evaluate the rhs expression and return its values. (rhs gets re-evaluated for each null).

With this implementation empty newop2 "foo" or [] | .[] newop2 "foo" will not return anything, because it loops the results of .[]/empty that do not return anything.

If you want an operator like newop2, but that also runs rhs if lhs returns nothing, you need something like:

# lhs newop4 rhs
#  Run rhs for each null element returned by lhs instead of returning
#   that null; Also run rhs if lhs returns nothing.
def _newop4(lhs; rhs):
  . as $dot |
  foreach ((lhs | [.]), null) as $l (false;
    if $l then true end;
    if if $l then $l[0] == null else not end
      then $dot | rhs
      else $l | arrays[]
    end);
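A quick way to try this out is sketched below. The definition is the one above, except that the one-armed `if` is spelled with an explicit `else . end` on the assumption that this keeps it usable on jq versions older than 1.7; the sample inputs are made up:

```sh
prog='def _newop4(lhs; rhs):
  . as $dot |
  foreach ((lhs | [.]), null) as $l (false;
    if $l then true else . end;
    if (if $l then $l[0] == null else not end)
      then $dot | rhs
      else $l | arrays[]
    end);'

# Each null on the left is replaced by every output of the right-hand side.
mixed=$(jq -nc "$prog"' [_newop4(1, false, null, 3, null; 4, 5)]')

# A left-hand side with no outputs also triggers the right-hand side.
nothing=$(jq -nc "$prog"' [_newop4(empty; "foo")]')

echo "$mixed"    # [1,false,4,5,3,4,5]
echo "$nothing"  # ["foo"]
```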

emanuele6 avatar Sep 28 '23 18:09 emanuele6

@nicowilliams ?= is NOT an option. foo?=bar is valid jq.

I agree, but I think you meant to address that to @wader :)

@nicowilliams oops :smile:

emanuele6 avatar Sep 28 '23 18:09 emanuele6

@olorin37 .foo? is shorthand for (try .foo), i.e. (try .foo catch empty): it returns .foo if the input is an object or null, but if the input is of another type it returns nothing instead of throwing an error; so that does not really do what you want either.
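For instance, a small sketch with made-up inputs, collecting the outputs of `.foo?` into an array so that "returns nothing" shows up as an empty array:

```sh
from_object=$(jq -nc '[{} | .foo?]')       # missing key on an object
from_null=$(jq -nc '[null | .foo?]')       # null input
from_string=$(jq -nc '["text" | .foo?]')   # wrong type: error is suppressed

echo "$from_object"  # [null]
echo "$from_null"    # [null]
echo "$from_string"  # []
```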

The syntax I mentioned earlier is .foo!? which does not exist in jq yet (it is proposed in #859). Logically it is the combination of .foo! and the ? operator. .foo currently returns null if the "foo" key is absent from the input object, or if the input is null instead of an object. .foo! would throw an error instead in those cases. So then ? wraps it in a try ... catch empty, and makes it return nothing instead of an error in that case.

Still, if you have something that returns null that is not .foo or .[10] with an absent key, and you want it to run the rhs, !? will not help you (e.g. .foo where "foo" is present and is null).

You would have to use |values (short for |select(. != null)) on the lhs to make newop3 work more similarly to what you want as @nicowilliams mentioned earlier:

Anyone who wants to have the alternation be about specific values can use if, and anyone who wants the alternation to be a bit more like // but for specific values can just (lhs | select(condition)) ?: rhs.

I.e. (.foo | values) newop3 "baz"
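A sketch of how that combination behaves, using the `_newop3` function form since no operator syntax exists yet. The one-armed `if` is written with an explicit `else . end`, assuming that keeps it usable on jq versions older than 1.7, and the input objects are made up:

```sh
prog='def _newop3(lhs; rhs):
  . as $dot |
  foreach ((lhs | [.]), null) as $l (false;
    if $l then true else . end;
    if $l then $l[] else select(not) | $dot | rhs end);'

# null and missing keys fall through to "baz"; false and 0 are kept.
defaults=$(jq -nc "$prog"'
  [({"foo": null}, {}, {"foo": false}, {"foo": 0}) | _newop3(.foo | values; "baz")]')

echo "$defaults"   # ["baz","baz",false,0]
```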

emanuele6 avatar Sep 28 '23 18:09 emanuele6

You are right about empty: it is a lack of value. Sorry, I do not have jq at hand right now to experiment and check things, so I may be wrong about empty and ? in my previous comment.

The key point, the most important thing for the operator, is to provide a default value if the left-hand side is null. Empty can give nothing. I need to check those ideas with jq.

olorin37 avatar Sep 28 '23 18:09 olorin37

For the discussed operator, I don't think that emptiness of lhs should trigger the rhs. Doing so will overload the operator.

In my mind, the purpose of this operator is to provide a default value in place of a null. We should not provide a default value in place of nothing, because nothing should cause a backtrack in the jq world, unwinding the stack until another value (or null) can be extracted at the nearest generation point.

leonid-s-usov avatar Sep 29 '23 11:09 leonid-s-usov

In my mind, the purpose of this operator is to provide a default value in place of a null.

I fully agree, actually. I was confused by something in the previous examples. The operator is meant to replace just the null value and nothing else (which means false or 0 should not be replaced).

olorin37 avatar Sep 29 '23 11:09 olorin37

// is a stateful sort of filter, producing all the values of the RHS IFF the LHS produces no satisfactory values.

The problem with // is in what makes LHS values satisfactory -- not-false-and-not-null is unsatisfactory. But any such filter, if just hard-coded, will be unsatisfactory. You want not-null, and someone else wants not-false and so on.

A version of // that only executes the RHS if the LHS is empty lets you write whatever filter you want for the LHS values. We already have if and select/1 for filtering values. What's hard to jq-code is "RHS IFF LHS is empty", but given "RHS IFF LHS is empty" you can stick a select(whatever) in the LHS.

Generality demands we not filter specific values here.

nicowilliams avatar Sep 29 '23 14:09 nicowilliams

You are technically correct. However, this doesn't help in practice, IMO.

This thread originally mentioned // because it was the closest thing to having a default value for a null, which is helpful because .foo will return null for a missing property rather than empty.

And yes, null is a special value. Much more so than false.

Suggesting a

select(. != null) /// "default"

seems to be the opposite of what is wanted: brevity.

We should probably separately discuss two things:

  • a handy if . == null then "default" else . end kind of operator, presumably ?:
  • a pure rhs IFF lhs is empty kind of thing that would NOT implicitly filter null and false from the lhs as // does. That could be ///, though I'm not very happy with it because it will cause eternal confusion with // in terms of which does what.

leonid-s-usov avatar Sep 29 '23 16:09 leonid-s-usov

You are technically correct. However, this doesn't help in practice, IMO.

This thread originally mentioned // because it was the closest thing to having a default value for a null, which is helpful because .foo will return null for a missing property rather than empty.

There's what OP asked for, and there's what we should do (if anything). They need not be the same thing.

And yes, null is a special value. Much more so than false.

It's not really special. If "a" is set in . and it's set to the value null, then .a will be null, but if "a" is not set in . then .a will also be null, which is why null is "special", but it's not that special.

What I want for the .a case is to add a postfix operator ! that causes .a! to error if "a" is not set in .. One could then write .a!? to get empty if "a" is not set in . or else to get whatever "a" is set to in . -- this would make null much less special.

Basically, I don't like that null is special -- it shouldn't be. And so I don't want a /// to treat null as special. It would be very dissatisfying to have /// treat null as special.

Suggesting a

select(. != null) /// "default"

seems to be the opposite of what is wanted: brevity.

I understand. jq is already plenty pithy though.

We should probably separately discuss two things:

* a handy `if . == null then "default" else . end` kind of operator, presumably `?:`

* a pure `rhs IFF lhs is empty` kind of thing that would NOT implicitly filter `null` and `false` from the lhs as `//` does. That could be `///`, though I'm not very happy with it because it will cause eternal confusion with `//` in terms of which does what.

/// and // would be very close in semantics, so it makes sense that they be close in form.

nicowilliams avatar Sep 29 '23 17:09 nicowilliams