Allow setting raw input delimiter

Open phiresky opened this issue 10 years ago • 9 comments

As far as I can tell, this is not currently possible?

Main use case: find | jq -R works, but because filenames can contain newlines, that is not safe. I'd like to use find -print0 instead, but jq does not allow setting \0 as the input delimiter (or setting the delimiter at all).

It can be circumvented with

find -print0|jq --slurp --raw-input 'split("\u0000")[]'

but that disables streaming, because the whole input has to be slurped before it can be split.

Usage in other programs (for \0):

  • xargs -0 or xargs --null
  • sed -z or sed --null-data

phiresky avatar Sep 28 '15 18:09 phiresky
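One way to keep streaming without the requested option is to put a small filter in front of jq that turns NUL-delimited records into one JSON string per line, which jq can then read as ordinary JSON input. The following is only a sketch of that idea, not something proposed in this thread: the function names `iter_nul_records` and `to_json_lines` are hypothetical, and decoding with `errors="replace"` is an assumption that loses information for filenames that are not valid UTF-8.

```python
import io
import json
from typing import Iterator


def iter_nul_records(stream: io.BufferedIOBase, chunk_size: int = 65536) -> Iterator[bytes]:
    """Yield NUL-delimited records from a binary stream, one chunk at a time."""
    pending = b""
    while True:
        # read1() returns as soon as some data is available, so records
        # are emitted while the producer is still running.
        chunk = stream.read1(chunk_size)
        if not chunk:
            break
        pending += chunk
        # Everything before the last NUL is complete; the tail may be partial.
        *complete, pending = pending.split(b"\0")
        yield from complete
    if pending:
        # Final record that was not NUL-terminated.
        yield pending


def to_json_lines(stream: io.BufferedIOBase) -> Iterator[str]:
    """Encode each record as a JSON string; newlines inside records get escaped."""
    for record in iter_nul_records(stream):
        # Assumption: undecodable bytes are replaced, which is lossy.
        yield json.dumps(record.decode("utf-8", errors="replace"))


# Small in-memory demonstration with a filename that contains a newline.
sample = io.BytesIO(b"a.txt\0dir/with\nnewline.txt\0")
for line in to_json_lines(sample):
    print(line)
```

Passing `sys.stdin.buffer` instead of the `BytesIO` object and writing each line to standard output would let the filter sit between `find -print0` and a plain `jq` invocation.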

I agree.

nicowilliams avatar Sep 28 '15 18:09 nicowilliams

I have a concern. If we enable setting the delimiter, do we automatically convert newline characters to \n in that mode? Otherwise we end up with invalid JSON strings. I feel this may be an example of an input that should be processed with sed or something before feeding into jq.

wtlangford avatar Sep 28 '15 19:09 wtlangford

Yes, probably, like slurp

> I feel this may be an example of an input that should be processed with sed or something before feeding into jq.

But how would that work? Without stopping streaming?

phiresky avatar Sep 28 '15 19:09 phiresky

@wtlangford Strings can contain newlines. Newlines in strings have to be escaped in encoded JSON texts, but here we're not dealing with JSON texts, as the input is raw, and the output of the "parser" is a jv string to feed to the jq VM.

nicowilliams avatar Sep 28 '15 19:09 nicowilliams
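The distinction between a string value and its encoded JSON text can be illustrated outside jq. This is a minimal sketch using Python's standard `json` module as a stand-in for jq's own encoder; the variable names are only illustrative.

```python
import json

# An in-memory string value containing a real newline character.
raw = "first line\nsecond line"

# Encoding to JSON text escapes the newline as the two characters \ and n.
encoded = json.dumps(raw)
assert "\n" not in encoded

# Decoding the JSON text restores the original string, newline included.
assert json.loads(encoded) == raw

print(encoded)
```

The raw value is valid as a string even though it would not be valid if pasted verbatim between quotes in a JSON text; escaping only matters at encoding time.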

Fair enough. I'm convinced.

wtlangford avatar Sep 28 '15 19:09 wtlangford

Having the same problem processing zsh history files, which use newlines between records, but may contain escaped newlines within records. I got sed to insert NULs to disambiguate records and then I bumped into this issue. Eventually had to do this backwards, getting sed to replace escaped newlines with NULs and keep newline as record separator in order to keep jq happy. The workaround was easy enough, but it would be really nice if jq would support NUL delimiter as per @phiresky's original comment.

nkgm avatar Oct 29 '18 11:10 nkgm
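The "backwards" workaround described above can be sketched as follows. It assumes, as a simplification, that a continued record is marked by a backslash immediately before the newline; real zsh history files have further details (timestamps, metacharacter encoding) that are ignored here, and `split_history_records` is a hypothetical helper name.

```python
from typing import List


def split_history_records(data: bytes) -> List[bytes]:
    """Split newline-separated records whose embedded newlines are backslash-escaped."""
    # Replace each escaped newline with NUL so it no longer looks like a separator.
    # Caveat: a record ending in an escaped backslash would be misread.
    protected = data.replace(b"\\\n", b"\0")
    # Plain newlines are now guaranteed to be record separators.
    return [line for line in protected.split(b"\n") if line]


def restore_newlines(record: bytes) -> bytes:
    """Turn the NUL placeholders back into real newlines after splitting."""
    return record.replace(b"\0", b"\n")


sample = b"echo one\necho two \\\n  continued\nls\n"
records = split_history_records(sample)
for record in records:
    print(restore_newlines(record))
```

With newline kept as the separator, each protected record can be fed to `jq -R` one line at a time, and the placeholder can be mapped back inside the jq program.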

We should add a -0 at least, and preferably also a -F CHAR or some appropriately-named long option.

nicowilliams avatar Dec 17 '18 17:12 nicowilliams

The -0 option has already been added; personally, I think that is enough and this issue can be closed now.

BTW, as pointed out in #1271, JSON strings can contain both LF ("\n") and NUL ("\u0000"), so -0 is not sufficient to prevent recipients from getting the wrong number of result strings (the same is true of -r, of course).

pabs3 avatar Sep 21 '21 07:09 pabs3
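The ambiguity is easy to reproduce without jq. In this sketch the sender writes each string followed by a NUL, which is an assumption about how a NUL-terminated output mode behaves, and the receiver splits on NUL; JSON-encoded lines are shown for contrast.

```python
import json

# Two result strings, one of which legitimately contains a NUL character.
values = ["a", "b\u0000c"]

# Sender: raw output, each value terminated by NUL.
raw_stream = "".join(value + "\0" for value in values)

# Receiver: split on NUL and drop the empty tail after the last terminator.
received = raw_stream.split("\0")[:-1]
print(len(values), len(received))

# Contrast: one JSON text per line escapes NUL and LF, so the count survives.
json_stream = "\n".join(json.dumps(value) for value in values)
decoded = [json.loads(line) for line in json_stream.split("\n")]
assert decoded == values
```

The receiver sees three strings where two were sent, which is the miscount described above; only an encoding that escapes the delimiter avoids it.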

Comes from https://github.com/wader/fq/issues/1019

I would also like jq to be usable as an alternative to perl/sed/awk. fq has imported --raw-output0 to set the output separator; a -0 or --raw-input0 option for input would be good too.

Freed-Wu avatar Oct 15 '24 07:10 Freed-Wu