jq
Add an option to treat empty input as null input
Currently, if you provide an empty input, jq appears to not run your filter program at all, which can be very puzzling (see #1497 and #1142, for example).
There are times when it would be more useful to treat empty input as a null value, so that your filter program actually gets a chance to run. You could then react to the lack of input and still produce an output, which currently isn't possible at all when the input is "empty".
Being able to produce an output value even when the input is empty means that you could also influence the exit code when using --exit-status (see https://github.com/stedolan/jq/issues/1497#issuecomment-372867248). That is otherwise impossible: jq currently always exits with 0 on empty input, although it is supposed to exit with 4 (#1497).
I propose adding a new --empty-input-as-null option to complement the familiar existing input options like --slurp and --null-input/-n. In fact, it would work identically to --null-input when the input was empty.
jq usually treats its input as a stream of 0 or more JSON values (separated by whitespace), which is great when you do in fact have multiple values coming in. But oftentimes you'll have an input script that generates a single, well-defined JSON value, and it would be more useful and intuitive to treat jq's input as a single value as well in order to process that script's single output value.
This option basically gives you that: It allows users to treat the input as if it were a single JSON value, falling back to null if needed instead of skipping running your filter entirely and not giving you a chance to generate an output value.
Examples:
> command_that_may_produce_empty_input_or_json | jq --empty-input-as-null '.'
null
# same as:
> jq -n '.'
null
This would be especially useful in combination with --exit-status.
> echo | jq --empty-input-as-null --exit-status '.'; echo $?
null
1
# same as:
> jq -n --exit-status '.'; echo $?
null
1
jq <settings -e --empty-input-as-null ".some_setting == true" || do_something_else
You could combine this with the // operator to provide your own fallback value if you want something other than null when the input is empty:
> echo | jq --empty-input-as-null --exit-status '. // true'; echo $?
true
0
# same as:
> jq -n --exit-status '. // true'; echo $?
true
0
Just encountered this with an empty file. If file is empty, filter is not processed at all:
touch file.json
jq '(. // {}) * {"new_key": "value"}' file.json
Is there any workaround?
bump
@e1senh0rn asked:
is there any workaround?
What is the alternative behavior you have in mind? Would the -s option be useful for you? Or are you asking for new file-handling functions? Please be specific.
@pkoppstein maybe the report wasn't the most accurate but it seems to me that it still refers to the original (and quite detailed) report here.
I've come across this counterintuitive behaviour myself while using jq to parse all sorts of output from very common tools like gcloud or the aws cli, and the fact that you can't use -e reliably because of this issue is a real pain and requires additional code and workarounds to cope with.
you can't use -e reliably because of this issue is a real pain and requires additional code and workarounds to cope with.
Exactly this. jq is often used to parse output from APIs, and if the API returns no output (e.g. curl returned a 403 with no body due to a missing token, or a 503 due to a service error), then jq proceeds happily along and we're none the wiser.
I think that, by far, the behavior users expect from -e is "exit on any false|null output", not "exit on any false|null output only if non-empty input". In fact, the docs don't even mention the non-empty quirk, which makes it all the more confusing.
Just here to say this bit me as well, specifically when using jq to parse the output of an aws command that started failing. I get the purity argument that an empty string is valid JSON, but from a practicality perspective this sure would be a useful option.
Let me rephrase the original issue; maybe it will make it more clear.
I have the following json input:
{
"foo": true
}
and I want to use jq from a shell script, checking if foo == true and printing "fail" otherwise.
I read the docs and came up with the following code (for those unfamiliar with shell, a || b checks the exit code of a and runs b if it is non-zero):
# valid input, foo is true
$ echo '{ "foo": true }' | jq -e '.foo == true' >/dev/null || echo "fail"
# valid input, foo is false
$ echo '{ "foo": false }' | jq -e '.foo == true' >/dev/null || echo "fail"
fail
# invalid input
$ echo '{ "foo": hoot }' | jq -e '.foo == true' >/dev/null || echo "fail"
parse error: Invalid numeric literal at line 1, column 14
fail
But it does not work for the case of no input:
$ echo | jq -e '.foo == true' >/dev/null || echo "fail"
Obviously, foo is not true (it's not even there), but my code does not print "fail". 👎
It seems that if -e is used, empty input should be treated like {}, otherwise we can't rely on exit code in case of empty input.
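Until such an option exists, one possible workaround for this exact case is to let the program pull its own input. This is a minimal sketch, assuming jq 1.5 or later (for -n, inputs and first/1). check_foo is a hypothetical helper name, and only the first input value is examined:

```shell
# Hypothetical helper: succeeds only if the first JSON value on stdin has foo == true.
# With -n jq does not read any input by itself; first(inputs, null) yields the
# first input value, or null when the input stream is empty.
check_foo() {
  jq -en 'first(inputs, null) | .foo == true' >/dev/null
}

echo '{ "foo": true }'  | check_foo || echo "fail"   # prints nothing
echo '{ "foo": false }' | check_foo || echo "fail"   # prints "fail"
echo                    | check_foo || echo "fail"   # prints "fail"
```

Because the filter now always produces exactly one output, -e always has a last output value to base the exit status on.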
I just came across the same problem. I'm trying to parse the results of a curl command, but when the server goes down, my curl command returns an empty string.
> curl -sS --fail localhost:3000/health | jq -re 'has("status") and .status != "RED" // false'
curl: (7) Failed to connect to localhost port 3000: Connection refused
> echo $?
0
In my jq instructions I'm explicitly trying to handle the bad state by always returning false, but when an empty string is passed through my error case isn't being run at all.
I'd love to see an option that forces the rules to be run with an empty string instead of bypassing them altogether.
Maybe you can do something like this:
$ echo -n '{"status": "GREEN"}' | jq -esRr 'if . == "" then null else fromjson end | has("status") and .status != "RED" // false' ; echo $?
true
0
$ echo -n '{"status": "RED"}' | jq -esRr 'if . == "" then null else fromjson end | has("status") and .status != "RED" // false' ; echo $?
false
1
$ echo -n '' | jq -esRr 'if . == "" then null else fromjson end | has("status") and .status != "RED" // false' ; echo $?
false
1
That works perfectly, thank you! I didn't realize using the -sR options will cause the empty string to be passed through and then run the rules, since the rules aren't run in the other cases. So I think those two options along with the first rule you have solves this perfectly.
👍 Yeap without -sR jq will run the filter on each input JSON it reads, ex:
$ echo '' | jq .
$ echo '123' | jq .
123
$ echo '123 123' | jq .
123
123
Which i think makes sense but might be a bit surprising
you can also just use:
jq -n 'try input catch null, inputs | . # your code'
or, if you want to be better about dealing with parse errors for the first input:
jq -n 'try input catch if . != "break" then error else null end, inputs | . # your code'
Yeap maybe a bit clearer than use raw slurp :)
Can also do:
jq -n 'inputs // null | ...'
RE: @wader
Nope, you can't do that. jq's // checks non-truthyness not emptiness; it's more similar to lua/python/javascript's or/or/|| than to perl's //. Annoying, but that is just how it is sadly; it would simplify many things if it actually only worked for empty.
inputs | . // null will replace all your false inputs with null (the only non-truthy values in jq are: false, null and empty); (also, inputs//null seems a little buggy and removes anything non-truthy as if you were doing inputs | select(.))
$ jq -n 'true, {}, null, 0, false, 10' | jq -n 'try input catch if . != "break" then error else null end, inputs'
true
{}
null
0
false
10
$ jq -n 'true, {}, null, 0, false, 10' | jq -n 'inputs | . // null'
true
{}
null
0
null
10
$ jq -n 'true, {}, null, 0, false, 10' | jq -n 'inputs // null'
true
{}
0
10
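The same semantics can be seen without any input at all. This is a small sketch using only literal values; each result is wrapped in an array so that the complete output fits on one line:

```shell
# a // b keeps every output of a that is neither false nor null ...
jq -cn '[(false, null, 1) // 42]'       # [1]
# ... and falls back to b only when a has no such output at all.
jq -cn '[(false, null) // 42]'          # [42]
jq -cn '[empty // 42]'                  # [42]
# Applied per value, every false or null is replaced individually.
jq -cn '[(false, null, 1) | . // 42]'   # [42,42,1]
```

So inputs // null does fall back to null on an empty stream, but it also drops every false or null input whenever at least one truthy input is present.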
Ah yeah good catch, i've been bitten by that before. Yes, i also wish jq had something like the // operator that only checks for emptiness. There is isempty, but i think it will be hard to use with inputs as it will consume the inputs.
What about:
jq -n '[inputs][0] | ...'
😄
Two more:
jq -n 'reduce inputs as $i (null; $i) | ...' # null or last input
jq -n 'first(inputs, null) | ...'
Ok time to do something more useful maybe :)
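For comparison, here is a quick sketch of how these one-liners behave on a stream of several values, on an empty stream, and on a stream whose first value is false. It assumes jq 1.5 or later:

```shell
printf '1 2 3\n'   | jq -n 'first(inputs, null)'              # 1     (stops after the first input)
printf '1 2 3\n'   | jq -n 'reduce inputs as $i (null; $i)'   # 3     (reads everything, keeps the last)
printf '1 2 3\n'   | jq -n '[inputs][0]'                      # 1     (collects all inputs in memory first)
printf ''          | jq -n 'first(inputs, null)'              # null
printf ''          | jq -n 'reduce inputs as $i (null; $i)'   # null
printf 'false 2\n' | jq -n 'first(inputs, null)'              # false (not replaced, unlike with //)
```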
RE: @wader
jq -n '[inputs][0] | ...'
I am guessing you forgot a .[1:][]:
jq -n '[ inputs ] | .[0], .[1:][]'
If the first thing you do is [ inputs ], you can just use -s:
jq -s '.[0], .[1:][]'
Yep, that's nice and short :D
I am not a big fan of it since I don't like unnecessary slurps which make jq read and load all inputs into memory before it can do anything, but that is a nice looking solution.
I think the ideal solution is:
jq -n 'try input catch if . != "break" then error else null end, inputs | . # your code'
This runs input/0, which reads one input and, unlike inputs/0, errors with "break" if there are no inputs to read. If you catch an error from input/0 and that error is "break", output null (otherwise re-raise the error). After reading the first input, just call inputs/0, which reads the rest of the inputs.
I am guessing you forgot a .[1:][]:
jq -n '[ inputs ] | .[0], .[1:][]'
Ah yes i was thinking about the case when you only want the first input or null.
Yep, that's nice and short :D
Oh forgot about slurp, so i guess if you only care about first value it can be:
jq -s '.[0]'
# or
jq -s first
:)
I am not a big fan of it since I don't like unnecessary slurps which make jq read and load all inputs into memory before it can do anything, but that is a nice looking solution.
I think the ideal solution is:
jq -n 'try input catch if . != "break" then error else null end, inputs | . # your code'
input (unlike inputs) reads one input and errors with "break" if there are no inputs to read. If you catch an error and it's "break", return null (otherwise forward the error). After reading the first input, just call inputs and read the rest of the inputs until there are no more or an error occurs, without checking anything.
Yes also like when jq uses generators over arrays to make things like that possible.
Yes also like when jq uses generators over arrays to make things like that possible.
input/0 is great! I wish it were possible to use it and inputs/0 to iterate any jq expression, not just inputs from the input files. Sometimes I think of splitting my jq scripts into two commands (jq '... | .[]' | jq -n '... | input | ...') just so that I can use it. (it's especially convenient when it's something like: jq '... | tostream' | jq -n '... | input | ...')
Here is a neat tool that I made that makes heavy usage of input/0: https://gist.github.com/emanuele6/b2f6055a5ac2cca4618f467d84f739fd
It "partitions" arrays read from stdin into multiple arrays of n elements (n is the argument you pass to the script):
$ ./partitioner.jq 2 <<< '[1,2,3] [4,5,6,7,8]'
[1,2]
[3,4]
[5,6]
[7,8]
$ ./partitioner.jq 5 <<< '[1,2,3] [4,5,6,7,8]'
[1,2,3,4,5]
[6,7,8]
input/0 is great! I wish it were possible to use it and inputs/0 to iterate any jq expression, not just inputs from the input files. Sometimes I think of splitting my jq scripts into two commands (jq '... | .[]' | jq -n '... | input | ...') just so that I can use it. (it's especially convenient when it's something like: jq '... | tostream' | jq -n '... | input | ...')
Idea is to use the output of something later on in a filter pipeline? would a binding work instead? or maybe this would be more similar to coroutines https://github.com/stedolan/jq/issues/1342?
Here is a neat tool that I made that makes heavy usage of input/0: https://gist.github.com/emanuele6/b2f6055a5ac2cca4618f467d84f739fd
It "partitions" arrays read from stdin into multiple arrays of n elements (n is the argument you pass to the script):
$ ./partitioner.jq 2 <<< '[1,2,3] [4,5,6,7,8]'
[1,2]
[3,4]
[5,6]
[7,8]
$ ./partitioner.jq 5 <<< '[1,2,3] [4,5,6,7,8]'
[1,2,3,4,5]
[6,7,8]
Nice! will have a look. For fq i added a chunk($size) function to do something similar as i need it quite a lot, but it only works on one input array.
In this post I would like to emphasize that to "branch" on whether
the input stream is empty or not, without losing the first item if
any, it is not necessary to use the -s command-line option or
[inputs], both of which may be undesirable if the input stream might
be very large.
The simplest, efficient, general-purpose way to distinguish between an empty and a non-empty input stream is to use the template:
jq -n '[try input catch infinite] | .[0] | if isinfinite then "empty" else ., inputs end'
That is, you would replace "empty" by the program that you want to handle the case of an empty input
stream, and place the main program (say P), right after inputs, like so:
jq -n '
def P: .;
[try input catch infinite]
| .[0]
| if isinfinite then "empty" else ., inputs | P end'
Enjoy!
Re: partitioner
Note that jq has an (undocumented but internally used) builtin, _nwise/1, that partitions an array input into arrays of up to the specified length. It can evidently be used with add to concatenate and then partition arrays. To process a stream, s, of arrays in this manner, one could write:
# input and output are both streams of arrays
def repartition(s; $n): [s] | add | _nwise($n);
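As a quick sketch of what that looks like in practice, the two commands below call _nwise/1 directly and then run repartition as defined above on the two arrays from the partitioner example. Since _nwise/1 is undocumented, its availability in any particular jq version is an assumption here:

```shell
# _nwise($n) splits its input array into chunks of at most $n elements.
jq -cn '[1,2,3,4,5] | _nwise(2)'
# [1,2]
# [3,4]
# [5]

# The stream of arrays is collected, concatenated, and only then partitioned.
jq -cn 'def repartition(s; $n): [s] | add | _nwise($n);
        repartition([1,2,3], [4,5,6,7,8]; 2)'
# [1,2]
# [3,4]
# [5,6]
# [7,8]
```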
RE: @pkoppstein
The simplest, efficient, general-purpose way to distinguish between an empty and a non-empty input stream is to use the template:
jq -n '[try input catch infinite] | .[0] | if isinfinite then "empty" else ., inputs end'
that doesn't handle parse errors for the first input correctly, but I don't think I understand how that is more simple/efficient/general-purpose than the solution I mentioned:
jq -n 'try input catch if . != "break" then error else "empty" end, inputs'
Note that jq has an (undocumented but internally used) builtin, _nwise/1, that partitions an array input into arrays of up to the specified length.
I know _nwise/1 exists, but that does not do the same thing as my script; that has to slurp the arrays and can only output when it has finished reading all the inputs; mine doesn't have to slurp, it can process the array inputs as they come.
$ # i am typing [1,2,3,4,5] and [2] and then pressing ^D
$ jq -n 'def repartition(s; $n): [s] | add | _nwise($n); repartition(inputs; 2)'
[1,2,3,4,5]
[2]
^D
[1,2]
[3,4]
[5,2]
$ ./partitioner.jq 2
[1,2,3,4,5]
[1,2]
[3,4]
[2]
[5,2]
^D
RE: @wader
Idea is to use the output of something later on in a filter pipeline?
This is probably getting a little OT, but the idea is to be able to easily read an arbitrary number of values at each iteration without having to use reduce and only being able to output at the end, or having to use foreach and [[state],actual_value_to_output]:
example: split an array into multiple arrays at every null.
$ # with input/0
$ jq -n '[1,2,3,null,4,6,false,5,null,12,3,4,5,7,7,6][]' | jq -cn 'try repeat(1 | [ while(. != null; ([ input ]? // error)[]) ]) catch . | arrays[1:]'
[1,2,3]
[4,6,false,5]
[12,3,4,5,7,7,6]
$ # with foreach
$ # you lose values at the end if the array is not null terminated:
$ jq -cn '[1,2,3,null,4,6,false,5,null,12,3,4,5,7,7,6] | foreach .[] as $v ([ false, [] ]; if .[0] then [false,[]] else . end | if $v != null then .[1] += [ $v ] else .[0] = true end; select(.[0])[1] | arrays)'
[1,2,3]
[4,6,false,5]
$ jq -cn '[1,2,3,null,4,6,false,5,null,12,3,4,5,7,7,6,null] | foreach .[] as $v ([ false, [] ]; if .[0] then [false,[]] else . end | if $v != null then .[1] += [ $v ] else .[0] = true end; select(.[0])[1] | arrays)'
[1,2,3]
[4,6,false,5]
[12,3,4,5,7,7,6]
$ # with reduce; can only output at the end
$ jq -cn '[1,2,3,null,4,6,false,5,null,12,3,4,5,7,7,6] | reduce .[] as $v ([[]]; if $v == null then . + [ [] ] else .[-1] += [ $v ] end) | .[]'
[1,2,3]
[4,6,false,5]
[12,3,4,5,7,7,6]
This is a simple example, but, when you need to take an arbitrary number of values at each iteration from an array or stream (something that you often want to do when reconstructing a value from tostream/0 output or --stream) and you want to output whenever possible instead of having to wait until you have iterated the whole array or stream, an input/0 loop is usually more convenient to use compared to foreach .[] as $v/foreach inputs as $v or reduce .[] as $v/reduce inputs as $v.
CONS:
reduce:
- slurps
- must wait until the end of the stream before it can output
foreach:
- can't output final incomplete value
input/0 loop:
- may make the code look too procedural
@emanuele6 - Please note that my two most recent posts above were not addressed to you because they were not intended as a comment on or critique of your contributions.
The first of these two posts (try input catch infinite) was motivated in part
by the fact that your response, while
addressing the OP's question, did not address the related question
about branching on whether the input stream is empty or not.
A case of apples and oranges if you like.
It's also a case of apples and oranges with respect to your partitioner.jq script and my repartition function. I did not mean to suggest that you did not know about _nwise or that my very simple program was equivalent to your much more elaborate one. I was just pointing out how the functionality illustrated in your original "partitions" post could be achieved using _nwise.
By the way, for anyone interested in a simple jq program to repartition in an incremental manner (in particular, without concatenating the arrays), here's one solution [with correction since first posting]:
# s is assumed to be a stream of arrays
def repartition(s; $n):
  foreach (s,null) as $a (null;   # state: {emit, buffer}
    if $a == null then {emit: .buffer}   # end of stream: flush what is left
    elif $a == [] then .emit = null      # empty array: nothing new to emit
    else .buffer += $a
    | (.buffer|length) as $len
    | if $len >= $n
      then ($len % $n) as $x
      | {emit: .buffer[: $len - $x], buffer: (if $x>0 then .buffer[-$x:] else null end)}
      else .emit = null
      end
    end;
    select(.emit).emit | _nwise($n) );
Example: repartition([1,2],[3,4,5],[6,7]; 2) emits [1,2], [3,4], [5,6] and finally [7].
RE: @pkoppstein
Please note that my two most recent posts above were not addressed to you because they were not intended as a comment on or critique of your contributions
I was just trying to figure out how it was different from mine; it looked like just mine, but with the extra step of using infinite and with incorrect handling of parse errors for the first input.
The first of these two posts (try input catch infinite) was motivated in part by the fact that your response, while addressing the OP's question, did not address the related question about branching on whether the input stream is empty or not. A case of apples and oranges if you like.
jq -n 'try input catch if . != "break" then error else "empty" end, inputs'
If you want to branch on the case in which there are no inputs with mine, you just have to write code where I wrote "empty", and the parse errors for the first input were handled properly in mine, so I couldn't figure out why you did the extra steps. (I thought about it for quite a bit, that is why I asked.)
$ printf '\n' | jq -n 'try input catch if . != "break" then error else "empty" end, inputs'
"empty"
$ printf '%s\n' '"hello"' '"hi"' | jq -n 'try input catch if . != "break" then error else "empty" end, inputs'
"hello"
"hi"
$ printf '%s\n' '"hello"' 'hi' | jq -n 'try input catch if . != "break" then error else "empty" end, inputs'
"hello"
jq: error (at <stdin>:2): Invalid numeric literal at line 3, column 0
$ printf '%s\n' 'hello' '"hi"' | jq -n 'try input catch if . != "break" then error else "empty" end, inputs'
jq: error (at <stdin>:1): Invalid numeric literal at line 2, column 0
$
$ printf '\n' | jq -n '[try input catch infinite] | .[0] | if isinfinite then "empty" else ., inputs end'
"empty"
$ printf '%s\n' '"hello"' '"hi"' | jq -n '[try input catch infinite] | .[0] | if isinfinite then "empty" else ., inputs end'
"hello"
"hi"
$ printf '%s\n' '"hello"' 'hi' | jq -n '[try input catch infinite] | .[0] | if isinfinite then "empty" else ., inputs end'
"hello"
jq: error (at <stdin>:2): Invalid numeric literal at line 3, column 0
$ printf '%s\n' 'hello' '"hi"' | jq -n '[try input catch infinite] | .[0] | if isinfinite then "empty" else ., inputs end'
"empty"
By the way, for anyone interested in a simple jq program to repartition in an incremental manner (in particular, without concatenating the arrays), here's one solution:
# s is assumed to be a stream of arrays
def repartition(s; $n):
  foreach (s,null) as $a (null;   # {emit, buffer}
    if $a == null then {emit: .buffer}
    else .buffer += $a
    | if (.buffer|length) >= $n
      then {emit: .buffer[:$n], buffer: .buffer[$n:]}
      else .emit = null
      end
    end;
    select(.emit|length>0).emit );
That could be a possible solution, but it should be allowed to emit more than one value per iteration otherwise it will output arrays a lot after they were input if it gets arrays larger than $n (that should be the usual case), and, more importantly, it can build up a huge buffer (if there are not enough arrays with length < $n to balance the larger ones) that will be output entirely at the end.
for example:
$ # i am entering [1,2,3,4,5,6,7], ["a","b","c","d","e"] and ^D
$ jq -cn '# s is assumed to be a stream of arrays
def repartition(s; $n):
foreach (s,null) as $a (null; # {emit, buffer}
if $a == null then {emit: .buffer}
else .buffer += $a
| if (.buffer|length) >= $n
then {emit: .buffer[:$n], buffer: .buffer[$n:]}
else .emit = null
end
end;
select(.emit|length>0).emit );
repartition(inputs; 3)'
[1,2,3,4,5,6,7]
[1,2,3]
["a","b","c","d","e"]
[4,5,6]
^D
[7,"a","b","c","d","e"]
possible fix:
# s is assumed to be a stream of arrays
def repartition(s; $n):
foreach (s,null) as $a ({emit: []};
if $a == null then {emit: [ .buffer // empty ]}
else .buffer += $a
| if (.buffer|length) < $n then .emit = []
else [ .buffer | _nwise($n) ]
| if (.[-1]|length) == $n
then {emit: .}
else {emit: .[:-1], buffer: .[-1]}
end
end
end;
.emit[]);
@emanuele6 - To see the difference, suppose we want to branch on whether the input stream is empty. If jq had a side-effect-free version of isempty/1, we would write something like:
if sideffect_free_isempty(inputs) then X else E end. # not currently possible
Using the 'infinite' template, we have only to write:
jq -n '[try input catch infinite] | .[0] | if isinfinite then X else ., inputs | E end'
or, taking into account your concern:
jq -n '[try input catch if . == "break" then infinite else error end] | .[0] | if isinfinite then X else ., inputs | E end'
Using your approach, we would have:
jq -n 'try input catch if . != "break" then error else X end, inputs | P'
So the difference is now obvious: with your program, an empty input stream results in X|P rather than just P.
I've fixed the incremental version of repartition. Thanks.
Oh, right. I didn't think of that for some reason! Thank you.
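The X|P versus X difference can be reproduced with a small sketch. To stay independent of the exact error message raised by input, both variants below catch every error, so a parse error in the first input is treated like an empty stream. That is a simplification made for this example, not a recommendation. variant_a and variant_b are hypothetical names, "X" stands for the empty-input branch, and P is modelled as a string interpolation:

```shell
# Variant A: the fallback value "X" is emitted into the same stream as the
# inputs, so it is piped through P as well.
variant_a() {
  jq -rn '(try input catch "X"), inputs | "P(\(.))"'
}

# Variant B: branch first; only real inputs ever reach P.
variant_b() {
  jq -rn '[try input catch infinite] | .[0]
          | if isinfinite then "X" else (., inputs) | "P(\(.))" end'
}

printf ''      | variant_a   # P(X)
printf ''      | variant_b   # X
printf '1 2\n' | variant_a   # P(1) then P(2)
printf '1 2\n' | variant_b   # P(1) then P(2)
```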