jq Add an option to treat empty input as null input

Currently, if you provide an empty input, jq appears to not run your filter program at all, which can be very puzzling (see #1497 and #1142, for example).

There are times when it would be more useful to treat empty input as a null value, so that your filter program gets a chance to run. You could then react to the lack of input and still produce an output, which currently isn't possible at all when the input is "empty".

Being able to produce an output value even when the input is empty also means that you could influence the exit code when using --exit-status (see https://github.com/stedolan/jq/issues/1497#issuecomment-372867248). That is otherwise impossible: jq currently always exits with 0 in this case, although it is supposed to exit with 4 (#1497).

I propose adding a new --empty-input-as-null option to complement the familiar existing input options like --slurp and --null-input/-n. In fact, it would work identically to --null-input when the input was empty.

jq usually treats its input as a stream of 0 or more JSON values (separated by whitespace), which is great when you do in fact have multiple values coming in. But oftentimes you'll have an input script that generates a single, well-defined JSON value, and it would be more useful and intuitive to treat jq's input as a single value as well in order to process that script's single output value.

This option basically gives you that: It allows users to treat the input as if it were a single JSON value, falling back to null if needed instead of skipping running your filter entirely and not giving you a chance to generate an output value.

Examples:

> command_that_may_produce_empty_input_or_json | jq --empty-input-as-null '.'
null

# same as: 
> jq -n '.'
null

This would be especially useful in combination with --exit-status.

> echo | jq --empty-input-as-null --exit-status '.'; echo $?
null
1

# same as: 
> jq -n --exit-status '.'; echo $?
null
1

jq <settings -e --empty-input-as-null ".some_setting == true" || do_something_else

You could combine this with the // operator to provide your own fallback behavior/value if you want something other than null when there's empty input:

> echo | jq --empty-input-as-null --exit-status '. // true'; echo $?
true
0

# same as:
> jq -n --exit-status '. // true'; echo $?
true
0
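
Until an option like this exists, the single-value case can be approximated with features jq already has. What follows is only a minimal sketch: jq_empty_as_null is a hypothetical shell wrapper (not part of jq), and it assumes that only the first input value matters.

```shell
# Hypothetical wrapper, not part of jq: run FILTER on the first input value,
# or on null when stdin contains no JSON values at all.
# Usage: ... | jq_empty_as_null FILTER [extra jq options]
jq_empty_as_null() {
  filter=$1
  shift
  # -n stops jq from consuming input on its own; first(inputs, null) then
  # yields the first value if there is one and null otherwise.
  jq -n "$@" "first(inputs, null) | ($filter)"
}

printf '' | jq_empty_as_null '. // true'    # prints: true
echo '{"some_setting": true}' | jq_empty_as_null '.some_setting == true' -e
```

Any values after the first one are ignored, so this is not a substitute for a real streaming option, and the filter is spliced into the program text, so it should not end in a jq comment.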

Mar 14 '18 01:03 TylerRick

Just encountered this with an empty file. If file is empty, filter is not processed at all:

touch file.json
jq '(. // {}) * {"new_key": "value"}' file.json

Is there any workaround?

Jul 26 '19 13:07 e1senh0rn

bump

Apr 01 '20 19:04 nocive

@e1senh0rn asked:

is there any workaround?

What is the alternative behavior you have in mind? Would the -s option be useful for you? Or are you asking for new file-handling functions? Please be specific.

Apr 01 '20 21:04 pkoppstein

@pkoppstein maybe the report wasn't the most accurate, but it seems to me that it still refers to the original (and quite detailed) report here. I've come across this counterintuitive behaviour myself while using jq to parse all sorts of output from very common tools like gcloud or aws cli, and the fact that you can't use -e reliably because of this issue is a real pain and requires additional code and workarounds to cope with.

Apr 11 '20 21:04 nocive

you can't use -e reliably because of this issue is a real pain and requires additional code and workarounds to cope with.

Exactly this. jq is often used to parse output from APIs, and if the API returns no output (e.g. curl returned a 403 with no body due to a missing token, or a 503 due to a service error) then jq proceeds happily along and we're none the wiser.

I think by far and wide, the user's expected behavior for -e is "exit on any false|null output" not "exit on any false|null output only if non-empty input". In fact, the docs don't even mention the non-empty quirk which makes it all the more confusing.
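
For anyone who wants to check what their own jq build does here, a small sketch follows. exit_status_of is a hypothetical helper name, and the last call is deliberately left without an expected value because that is exactly the behaviour under discussion.

```shell
# Hypothetical helper: feed stdin to `jq -e .` and print jq's exit status.
exit_status_of() {
  status=0
  jq -e '.' >/dev/null 2>&1 || status=$?
  echo "$status"
}

echo 'true'  | exit_status_of   # 0: last output is neither false nor null
echo 'false' | exit_status_of   # 1: last output is false
echo 'null'  | exit_status_of   # 1: last output is null
printf ''    | exit_status_of   # empty input: compare with what you expect
```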

Apr 27 '20 17:04 stewartadam

Just here to say this bit me as well, specifically when using jq to parse the output of an aws command that started failing. I get the purity argument that an empty string is valid JSON, but from a practicality perspective this sure would be a useful option.

Sep 16 '20 23:09 nicpottier

Let me rephrase the original issue; maybe it will make it more clear.

I have the following json input:

{
  "foo": true
}

and I want to use jq from a shell script, checking if foo == true and printing "fail" otherwise.

I read the docs and came up with the following code (for those unfamiliar with shell, a || b checks the exit code of a and runs b if it is non-zero):

# valid input, foo is true
$ echo '{ "foo": true }' | jq -e '.foo == true' >/dev/null || echo "fail"

# valid input, foo is false
$ echo '{ "foo": false }' | jq -e '.foo == true' >/dev/null || echo "fail"
fail

# invalid input
$ echo '{ "foo": hoot }' | jq -e '.foo == true' >/dev/null || echo "fail"
parse error: Invalid numeric literal at line 1, column 14
fail

But it does not work for the case of no input:

$ echo  | jq -e '.foo == true' >/dev/null || echo "fail"

Obviously, foo is not true, it's not even there, but my code does not print "fail". :-1:

It seems that if -e is used, empty input should be treated like {}, otherwise we can't rely on exit code in case of empty input.

Oct 09 '20 00:10 kolyshkin

I just came across the same problem. I'm trying to parse the results of a curl command, but when the server goes down, my curl command returns an empty string.

> curl -sS --fail localhost:3000/health | jq -re 'has("status") and .status != "RED" // false'
curl: (7) Failed to connect to localhost port 3000: Connection refused
> echo $?
0

In my jq instructions I'm explicitly trying to handle the bad state by always returning false, but when an empty string is passed through my error case isn't being run at all.

I'd love to see an option that forces the rules to be run with an empty string instead of bypassing them altogether.

Mar 30 '22 14:03 chrisregnier

Maybe you can do something like this:

$ echo -n '{"status": "GREEN"}' | jq -esRr 'if . == "" then null else fromjson end | has("status") and .status != "RED" // false' ; echo $?
true
0
$ echo -n '{"status": "RED"}' | jq -esRr 'if . == "" then null else fromjson end | has("status") and .status != "RED" // false' ; echo $?
false
1
$ echo -n '' | jq -esRr 'if . == "" then null else fromjson end | has("status") and .status != "RED" // false' ; echo $?
false
1

Mar 30 '22 14:03 wader

That works perfectly, thank you! I didn't realize using the -sR options will cause the empty string to be passed through and then run the rules, since the rules aren't run in the other cases. So I think those two options along with the first rule you have solves this perfectly.
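
To make the difference visible, here is a quick sketch of what the filter receives for empty input under each flag combination (printf '' stands in for a command that produces no output):

```shell
printf '' | jq -c '.'        # prints nothing: the filter never runs
printf '' | jq -c -s '.'     # prints []: the filter runs once on an empty array
printf '' | jq -c -sR '.'    # prints "": the filter runs once on an empty string
printf '' | jq -c -s '.[0]'  # prints null: first slurped value, or null if none
```

Both slurping variants read the whole input before the filter starts, which is fine for a single API response but may matter for large streams.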

Mar 30 '22 14:03 chrisregnier

👍 Yeap without -sR jq will run the filter on each input JSON it reads, ex:

$ echo '' | jq .
$ echo '123' | jq .
123
$ echo '123 123' | jq .
123
123

Which I think makes sense but might be a bit surprising.

Mar 30 '22 15:03 wader

you can also just use:

jq -n 'try input catch null, inputs | . # your code'

or, if you want to be better about dealing with parse errors for the first input:

jq -n 'try input catch if . != "break" then error else null end, inputs | . # your code'

Mar 30 '22 16:03 emanuele6

Yeap maybe a bit clearer than use raw slurp :)

Can also do:

jq -n 'inputs // null | ...'

Mar 30 '22 16:03 wader

Can also do:

 jq -n 'inputs // null | ...'

RE: @wader

Nope, you can't do that. jq's // checks non-truthyness not emptiness; it's more similar to lua/python/javascript's or/or/|| than to perl's //. Annoying, but that is just how it is sadly; it would simplify many things if it actually only worked for empty.

inputs | . // null will replace all your false inputs with null (the only non-truthy values in jq are: false, null and empty); (also, inputs//null seems a little buggy and removes anything non-truthy as if you were doing inputs | select(.))

$ jq -n 'true, {}, null, 0, false, 10' | jq -n 'try input catch if . != "break" then error else null end, inputs'
true
{}
null
0
false
10
$ jq -n 'true, {}, null, 0, false, 10' | jq -n 'inputs | . // null'
true
{}
null
0
null
10
$ jq -n 'true, {}, null, 0, false, 10' | jq -n 'inputs // null'
true
{}
0
10

Mar 30 '22 16:03 emanuele6

Ah yeah, good catch, I've been bitten by that before. I also wish jq had something like the // operator that only checks for emptiness. There is isempty, but I think it will be hard to use with inputs as it will consume the inputs.

What about:

jq -n '[inputs][0] | ...'

😄

Mar 30 '22 16:03 wader

Two more:

jq -n 'reduce inputs as $i (null; $i) | ...' # null or last input
jq -n 'first(inputs, null) | ...'

Ok time to do something more useful maybe :)

Mar 30 '22 16:03 wader

RE: @wader

jq -n '[inputs][0] | ...'

I am guessing you forgot a .[1:][]:

jq -n '[ inputs ] | .[0], .[1:][]'

If the first thing you do is [ inputs ], you can just use -s:

jq -s '.[0], .[1:][]'

Yep, that's nice and short :D

I am not a big fan of it since I don't like unnecessary slurps which make jq read and load all inputs into memory before it can do anything, but that is a nice looking solution.

I think the ideal solution is:

jq -n 'try input catch if . != "break" then error else null end, inputs | . # your code'

run input/0 which reads one input and, unlike inputs/0, errors with "break" if there are no inputs to read; if you catch an error from input/0 and that error is "break", output null (otherwise reproduce the error); after reading the first input, just call inputs/0 that will read the rest of the inputs.

Mar 30 '22 16:03 emanuele6

I am guessing you forgot a .[1:][]:
jq -n '[ inputs ] | .[0], .[1:][]'

Ah yes i was thinking about the case when you only want the first input or null.

Yep, that's nice and short :D

Oh forgot about slurp, so i guess if you only care about first value it can be:

jq -s '.[0]'
# or
jq -s first

:)

I am not a big fan of it since I don't like unnecessary slurps which make jq read and load all inputs into memory before it can do anything, but that is a nice looking solution.

I think the ideal solution is:
jq -n 'try input catch if . != "break" then error else null end, inputs | . # your code'
input (unlike inputs) reads one input and errors with "break" if there are no inputs to read. if you catch an error and it's "break", return null (otherwise forward the error), after reading the first input, just call inputs and read the rest of the inputs until there are no more or an error occurs without checking anything.

Yes also like when jq uses generators over arrays to make things like that possible.

Mar 30 '22 16:03 wader

Yes also like when jq uses generators over arrays to make things like that possible.

input/0 is great! I wish it were possible to use it and inputs/0 to iterate any jq expression, not just inputs from the input files. Sometimes I think of splitting my jq scripts into two commands (jq '... | .[]' | jq -n '... | input | ...') just so that I can use it. (it's especially convenient when it's something like: jq '... | tostream' | jq -n '... | input | ...')

Here is a neat tool that I made that makes heavy usage of input/0: https://gist.github.com/emanuele6/b2f6055a5ac2cca4618f467d84f739fd

It "partitions" arrays read from stdin into multiple arrays of n elements (n is the argument you pass to the script):

$ ./partitioner.jq 2 <<< '[1,2,3] [4,5,6,7,8]'
[1,2]
[3,4]
[5,6]
[7,8]
$ ./partitioner.jq 5 <<< '[1,2,3] [4,5,6,7,8]'
[1,2,3,4,5]
[6,7,8]

Mar 30 '22 17:03 emanuele6

input/0 is great! I wish it were possible to use it and inputs/0 to iterate any jq expression, not just inputs from the input files. Sometimes I think of splitting my jq scripts into two commands (jq '... | .[]' | jq -n '... | input | ...') just so that I can use it. (it's especially convenient when it's something like: jq '... | tostream' | jq -n '... | input | ...')

Idea is to use the output of something later on in a filter pipeline? would a binding work instead? or maybe this would be more similar to coroutines https://github.com/stedolan/jq/issues/1342?

Here is a neat tool that I made that makes heavy usage of input/0: https://gist.github.com/emanuele6/b2f6055a5ac2cca4618f467d84f739fd

It "partitions" arrays read from stdin into multiple arrays of n elements (n is the argument you pass to the script):
$ ./partitioner.jq 2 <<< '[1,2,3] [4,5,6,7,8]'
[1,2]
[3,4]
[5,6]
[7,8]
$ ./partitioner.jq 5 <<< '[1,2,3] [4,5,6,7,8]'
[1,2,3,4,5]
[6,7,8]

Nice! will have a look. For fq i added a chunk($size) function to do something similar as i need it quite a lot, but it only works on one input array.

Mar 30 '22 20:03 wader

In this post I would like to emphasize that to "branch" on whether the input stream is empty or not, without losing the first item if any, it is not necessary to use the -s command-line option or [inputs], both of which may be undesirable if the input stream might be very large.

The simplest, efficient, general-purpose way to distinguish between an empty and a non-empty input stream is to use the template:

jq -n '[try input catch infinite] | .[0] | if isinfinite then "empty" else ., inputs end'

That is, you would replace "empty" by the program that you want to handle the case of an empty input stream, and place the main program (say P), right after inputs, like so:

jq -n '
 def P: .;
 [try input catch infinite]
 | .[0]
 | if isinfinite then "empty" else ., inputs | P end'

Enjoy!

Mar 30 '22 22:03 pkoppstein

Re: partitioner

Note that jq has an (undocumented but internally used) builtin, _nwise/1, that partitions an array input into arrays of up to the specified length. It can evidently be used with add to concatenate and then partition arrays. To process a stream, s, of arrays in this manner, one could write:

   # input and output are both streams of arrays
  def repartition(s; $n): [s] | add | _nwise($n);

Mar 31 '22 05:03 pkoppstein

RE: @pkoppstein

The simplest, efficient, general-purpose way to distinguish between an empty and a non-empty input stream is to use the template:
jq -n '[try input catch infinite] | .[0] | if isinfinite then "empty" else ., inputs end'

That doesn't handle parse errors for the first input correctly, but I don't think I understand how that is more simple/efficient/general-purpose than the solution I mentioned:

jq -n 'try input catch if . != "break" then error else "empty" end, inputs'

Note that jq has an (undocumented but internally used) builtin, _nwise/1, that partitions an array input into arrays of up to the specified length.

I know _nwise/1 exists, but that does not do the same thing as my script; that has to slurp the arrays and can only output when it has finished reading all the inputs; mine doesn't have to slurp, it can process the array inputs as they come.

$ # i am typing [1,2,3,4,5] and [2] and then pressing ^D
$ jq -n 'def repartition(s; $n): [s] | add | _nwise($n); repartition(inputs; 2)'
[1,2,3,4,5]
[2]
^D
[1,2]
[3,4]
[5,2]
$ ./partitioner.jq 2
[1,2,3,4,5]
[1,2]
[3,4]
[2]
[5,2]
^D

RE: @wader

Idea is to use the output of something later on in a filter pipeline?

This is probably getting a little OT, but the idea is to be able to easily read an arbitrary number of values at each iteration without having to use reduce and only being able to output at the end, or having to use foreach and [[state],actual_value_to_output]:

example: split an array into multiple arrays at every null.

$ # with input/0
$ jq -n '[1,2,3,null,4,6,false,5,null,12,3,4,5,7,7,6][]' | jq -cn 'try repeat(1 | [ while(. != null; ([ input ]? // error)[]) ]) catch . | arrays[1:]'
[1,2,3]
[4,6,false,5]
[12,3,4,5,7,7,6]
$ # with foreach
$ # you lose values at the end if the array is not null terminated:
$ jq -cn '[1,2,3,null,4,6,false,5,null,12,3,4,5,7,7,6] | foreach .[] as $v ([ false, [] ]; if .[0] then [false,[]] else . end | if $v != null then .[1] += [ $v ] else .[0] = true end; select(.[0])[1] | arrays)'
[1,2,3]
[4,6,false,5]
$ jq -cn '[1,2,3,null,4,6,false,5,null,12,3,4,5,7,7,6,null] | foreach .[] as $v ([ false, [] ]; if .[0] then [false,[]] else . end | if $v != null then .[1] += [ $v ] else .[0] = true end; select(.[0])[1] | arrays)'
[1,2,3]
[4,6,false,5]
[12,3,4,5,7,7,6]
$ # with reduce; can only output at the end
$ jq -cn '[1,2,3,null,4,6,false,5,null,12,3,4,5,7,7,6] | reduce .[] as $v ([[]]; if $v == null then . + [ [] ] else .[-1] += [ $v ] end) | .[]'
[1,2,3]
[4,6,false,5]
[12,3,4,5,7,7,6]

This is a simple example, but, when you need to take an arbitrary number of values at each iteration from an array or stream (something that you often want to do when reconstructing a value from tostream/0 output or --stream) and you want to output whenever possible instead of having to wait until you have iterated the whole array or stream, an input/0 loop is usually more convenient to use compared to foreach .[] as $v/foreach inputs as $v or reduce .[] as $v/reduce inputs as $v.

CONS:

reduce:
- slurps
- must wait until the end of the stream before it can output
foreach:
- can't output final incomplete value
input/0 loop:
- may make the code look too procedural

Mar 31 '22 09:03 emanuele6

@emanuele6 - Please note that my two most recent posts above were not addressed to you because they were not intended as a comment on or critique of your contributions.

The first of these two posts (try input catch infinite) was motivated in part by the fact that your response, while addressing the OP's question, did not address the related question about branching on whether the input stream is empty or not. A case of apples and oranges if you like.

It's also a case of apples and oranges with respect to your partitioner.jq script and my repartition function. I did not mean to suggest that you did not know about _nwise or that my very simple program was equivalent to your much more elaborate one. I was just pointing out how the functionality illustrated in your original "partitions" post could be achieved using _nwise.

By the way, for anyone interested in a simple jq program to repartition in an incremental manner (in particular, without concatenating the arrays), here's one solution [with correction since first posting]:

# s is assumed to be a stream of arrays
def repartition(s; $n):
  foreach (s,null) as $a (null;  # {emit, buffer}
     if $a == null then {emit: .buffer}
     elif $a == [] then .emit = null  # nothing new arrived; do not re-emit the previous chunk
     else .buffer += $a
     | (.buffer|length) as $len
     | if $len >= $n
       then ($len % $n) as $x
       | {emit: .buffer[: $len - $x], buffer: (if $x>0 then .buffer[-$x:] else null end)}
       else .emit = null
       end
     end;
     select(.emit).emit | _nwise($n) );

Example: repartition([1,2],[3,4,5],[6,7]; 2)
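
For convenience, the example can be run as one complete command. This is only a sketch: the definition is included so that the snippet is self-contained, and run_repartition_example is a hypothetical shell wrapper added purely for illustration.

```shell
# Hypothetical wrapper that runs the example call with -n (no input) and -c (compact output).
run_repartition_example() {
  jq -nc '
    # s is assumed to be a stream of arrays
    def repartition(s; $n):
      foreach (s,null) as $a (null;  # state: {emit, buffer}
         if $a == null then {emit: .buffer}   # end of stream: flush the remainder
         elif $a == [] then .emit = null      # empty array: nothing to emit
         else .buffer += $a
         | (.buffer|length) as $len
         | if $len >= $n
           then ($len % $n) as $x
           | {emit: .buffer[: $len - $x], buffer: (if $x>0 then .buffer[-$x:] else null end)}
           else .emit = null
           end
         end;
         select(.emit).emit | _nwise($n) );
    repartition([1,2],[3,4,5],[6,7]; 2)'
}

run_repartition_example
```

Assuming _nwise/1 is available in the installed jq, this should print [1,2], [3,4], [5,6] and [7], one per line, with each chunk emitted as soon as enough elements have arrived.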

Apr 01 '22 07:04 pkoppstein

RE: @pkoppstein

Please note that my two most recent posts above were not addressed to you because they were not intended as a comment on or critique of your contributions

I was just trying to figure out how it was different from mine; it looked like just mine, but with the extra step of using infinite and with incorrect handling of parse errors for the first input.

The first of these two posts (try input catch infinite) was motivated in part by the fact that your response, while addressing the OP's question, did not address the related question about branching on whether the input stream is empty or not. A case of apples and oranges if you like.

jq -n 'try input catch if . != "break" then error else "empty" end, inputs'

If you want to branch on the case in which there are no inputs with mine, you just have to write code where I wrote "empty" and the parse errors for the first input were handled properly in mine, so I couldn't figure out why you did the extra steps. (I thought about it for quite a bit, that is why I asked.)

$ printf '\n' | jq -n 'try input catch if . != "break" then error else "empty" end, inputs'
"empty"
$ printf '%s\n' '"hello"' '"hi"' | jq -n 'try input catch if . != "break" then error else "empty" end, inputs'
"hello"
"hi"
$ printf '%s\n' '"hello"' 'hi' | jq -n 'try input catch if . != "break" then error else "empty" end, inputs'
"hello"
jq: error (at <stdin>:2): Invalid numeric literal at line 3, column 0
$ printf '%s\n' 'hello' '"hi"' | jq -n 'try input catch if . != "break" then error else "empty" end, inputs'
jq: error (at <stdin>:1): Invalid numeric literal at line 2, column 0
$
$ printf '\n' | jq -n '[try input catch infinite] | .[0] | if isinfinite then "empty" else ., inputs end'
"empty"
$ printf '%s\n' '"hello"' '"hi"' | jq -n '[try input catch infinite] | .[0] | if isinfinite then "empty" else ., inputs end'
"hello"
"hi"
$ printf '%s\n' '"hello"' 'hi' | jq -n '[try input catch infinite] | .[0] | if isinfinite then "empty" else ., inputs end'
"hello"
jq: error (at <stdin>:2): Invalid numeric literal at line 3, column 0
$ printf '%s\n' 'hello' '"hi"' | jq -n '[try input catch infinite] | .[0] | if isinfinite then "empty" else ., inputs end'
"empty"

By the way, for anyone interested in a simple jq program to repartition in an incremental manner (in particular, without concatenating the arrays), here's one solution:
# s is assumed to be a stream of arrays
def repartition(s; $n):
  foreach (s,null) as $a (null;  # {emit, buffer}
     if $a == null then {emit: .buffer}
     else .buffer += $a
     | if (.buffer|length) >= $n
       then {emit: .buffer[:$n], buffer: .buffer[$n:]}
       else .emit = null
       end
     end;
     select(.emit|length>0).emit );

That could be a possible solution, but it should be allowed to emit more than one value per iteration, otherwise it will output arrays long after they were input if it gets arrays larger than $n (that should be the usual case), and, more importantly, it can build up a huge buffer (if there are not enough arrays with length < $n to balance the larger ones) that will be output entirely at the end.

for example:

$ # i am entering [1,2,3,4,5,6,7], ["a","b","c","d","e"] and ^D
$ jq -cn '# s is assumed to be a stream of arrays
def repartition(s; $n):
  foreach (s,null) as $a (null;  # {emit, buffer}
     if $a == null then {emit: .buffer}
     else .buffer += $a
     | if (.buffer|length) >= $n
       then {emit: .buffer[:$n], buffer: .buffer[$n:]}
       else .emit = null
       end
     end;
     select(.emit|length>0).emit );
repartition(inputs; 3)'
 [1,2,3,4,5,6,7]
[1,2,3]
 ["a","b","c","d","e"]
[4,5,6]
^D
[7,"a","b","c","d","e"]

possible fix:

# s is assumed to be a stream of arrays
def repartition(s; $n):
  foreach (s,null) as $a ({emit: []};
    if $a == null then {emit: [ .buffer // empty ]}
    else .buffer += $a
    | if (.buffer|length) < $n then .emit = []
      else [ .buffer | _nwise($n) ]
      | if (.[-1]|length) == $n
        then {emit: .}
        else {emit: .[:-1], buffer: .[-1]}
        end
      end
    end;
    .emit[]);

Apr 01 '22 09:04 emanuele6

@emanuele6 - To see the difference, suppose we want to branch on whether the input stream is empty. If jq had a side-effect-free version of isempty/1, we would write something like:

   if sideffect_free_isempty(inputs) then X else E end. # not currently possible

Using the 'infinite' template, we have only to write:

jq -n '[try input catch infinite] | .[0] | if isinfinite then X else ., inputs | E end'

or, taking into account your concern:

jq -n '[try input catch if . == "break" then infinite else error end] | .[0] | if isinfinite then X else ., inputs | E end'

Using your approach, we would have:

jq -n 'try input catch if . != "break" then error else X end, inputs | P'

So the difference is now obvious: with your program, an empty input stream results in X|P rather than just X (P here plays the same role as E above).
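
The two shapes can also be compared side by side without depending on the exact error message that input raises. This is a sketch that uses first/1 in place of the try/catch templates; "X", P and the shell function names are placeholders chosen for illustration.

```shell
# Shape 1: a fallback value stands in for the missing first input,
# so it flows through P like any other value.
fallback_shape() {
  jq -nc 'def P: [.]; (first(inputs, "X"), inputs) | P'
}

# Shape 2: branch first, so "X" is produced directly and only real inputs reach P.
branch_shape() {
  jq -nc 'def P: [.];
    [first(inputs)]
    | if length == 0 then "X" else (.[0], inputs) | P end'
}

printf ''    | fallback_shape   # ["X"]        (X|P)
printf ''    | branch_shape     # "X"          (just X)
printf '1 2' | fallback_shape   # [1] then [2]
printf '1 2' | branch_shape     # [1] then [2]
```

Neither shape slurps the whole stream: first/1 stops after one value, and the remaining values are read lazily by inputs.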

I've fixed the incremental version of repartition. Thanks.

Apr 01 '22 15:04 pkoppstein

Oh, right. I didn't think of that for some reason! Thank you.

Apr 02 '22 13:04 emanuele6
