jq `contains(element)` incorrectly returns `true` when object value doesn't match

Open Integralist opened this issue 2 years ago • 2 comments

Describe the bug

I've noticed that `contains(element)`, when given an object argument whose value is a string, incorrectly returns `true` even when the JSON being parsed doesn't contain the value we're looking for.

To Reproduce

echo '{"foo": "bar", "baz": "qux", "beep": "stuff_in_front_of_boop"}' | jq 'contains({"foo":"bar", "beep":"boop"})'

Expected behavior

In the example above it should ideally report `false`, because the value assigned to the `beep` key is not the same as the one we're looking for. We're looking for `"boop"`, while the value in the JSON input is actually `"stuff_in_front_of_boop"` (please see the 'Additional context' below, as I have a suggestion in case this isn't a bug).
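For comparison, if an exact match on each key is what is wanted, one way to express it is with plain `==` comparisons instead of `contains`. This is only a sketch of that alternative using the same input as the reproducer; the shell variable names (`input`, `substring`, `exact`) are illustrative, and it assumes `jq` is on the `PATH`.

```shell
# Same JSON input as the reproducer.
input='{"foo": "bar", "baz": "qux", "beep": "stuff_in_front_of_boop"}'

# contains: string values are matched by substring, so this yields true.
substring=$(echo "$input" | jq 'contains({"foo":"bar", "beep":"boop"})')
echo "contains: $substring"

# Explicit equality per key: yields false because .beep is not exactly "boop".
exact=$(echo "$input" | jq '.foo == "bar" and .beep == "boop"')
echo "exact:    $exact"
```

The equality form has to name each key explicitly, which is more verbose, but it avoids the substring matching on string values.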

Environment (please complete the following information):

macOS 11.6
jq-1.6

Additional context

If this isn't a bug, then I feel the documentation should be rephrased so that the expected behaviour is less ambiguous. What I mean is that the documentation states...

An object B is contained in object A if all of the values in B are contained in the value in A with the same key.

The emphasis on "contained in" is my own. This could be interpreted as meaning "as long as part of the value is found in the input key's value, we'll return true". But my initial interpretation was that the value had to match exactly.

The actual outcome I was seeing would have been expected if the documentation had said something more like...

An object B is contained in object A if all of the values in B are found as a substring in the value in A with the same key.

I appreciate that the wording in my example isn't good, because the value could be a non-string type, which is likely why you used "contained in". What made the string outcome more confusing, though, was that the documented example uses an integer, where the ambiguity doesn't exist...

echo '{"foo": "bar", "baz": "qux", "beep": 123456}' | jq 'contains({"foo":"bar", "beep":123})'

Notice in the above example how an integer value does have to match exactly, whereas with the string example I provided it doesn't have to match exactly.

Ultimately I guess just making the string situation more explicit in the documentation would have helped me avoid any confusion.

All that said, this is probably not worth your time worrying too much about so I won't mind if you just close this issue 🙂

Dec 09 '21 09:12 Integralist

The word "contained" applies to every JSON type, including arrays, booleans and objects. Consider something like `echo '{"x":[1,2,3]}' | jq 'contains({"x":[2]})'`. Also, jq considers a string to contain any of its substrings: `echo '"something"' | jq 'contains("eth")'` yields `true`, which is natural, isn't it? For numbers and booleans, `contains` is defined by value equality, not by their string representations. This is working as intended, and it would be hard to change now without breaking compatibility, but I personally agree that this behavior is unintuitive.
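The per-type rules described above can be checked side by side. This is a small sketch rather than an exhaustive specification; it assumes `jq` is on the `PATH`, and the shell variable names are illustrative.

```shell
# Arrays: each element of the argument must be contained in some element of the input.
array_hit=$(jq -n '{"x":[1,2,3]} | contains({"x":[2]})')
array_miss=$(jq -n '{"x":[1,2,3]} | contains({"x":[4]})')

# Strings: any substring counts as contained.
string_hit=$(jq -n '"something" | contains("eth")')

# Numbers: compared by value equality, so 123 is not "contained" in 123456.
number_prefix=$(jq -n '123456 | contains(123)')
number_equal=$(jq -n '123 | contains(123)')

# Booleans: also compared by value equality.
bool_equal=$(jq -n 'true | contains(true)')
bool_differs=$(jq -n '{"a":true} | contains({"a":false})')

echo "arrays:   $array_hit $array_miss"
echo "strings:  $string_hit"
echo "numbers:  $number_prefix $number_equal"
echo "booleans: $bool_equal $bool_differs"
```

Only the string case matches on a partial value; the number and boolean cases succeed only when the values are equal.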

Dec 09 '21 15:12 itchyny

Perhaps the simplest reproducer is:

jq -n '["aa"] | contains( ["a"] )'

My intuition misled me as well. I understand the explanation given by @itchyny. I would expect this behavior if the function were named `contains_recursive`. Under the name `contains`, and based on the current documentation, the implementation feels wrong.
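To make the contrast concrete, here is a sketch that puts the reproducer next to an exact element test written with `any` and `==`. That test is just one possible way to ask for exact membership, not something proposed elsewhere in this thread; it assumes `jq` is on the `PATH`, and the shell variable names are illustrative.

```shell
# contains descends into the array and then matches strings by substring.
recursive=$(jq -n '["aa"] | contains(["a"])')

# Comparing each element with == only succeeds on an exact element match.
exact_miss=$(jq -n '["aa"] | any(.[]; . == "a")')
exact_hit=$(jq -n '["aa", "a"] | any(.[]; . == "a")')

echo "contains:          $recursive"
echo "exact, [aa]:       $exact_miss"
echo "exact, [aa, a]:    $exact_hit"
```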

Anyway: thank you for confirming that this is the intended behavior.

Jan 19 '22 22:01 borango

For #2305, it looks like this is an easy issue to close...

May 04 '23 14:05 trailstrider
