Add builtin to output to file
I would like to propose a new builtin, say `output_file("filename"; "contents")`, which copies its input to its output while saving its arguments as a file. It's similar in principle to the `debug` builtin.
If `--sandbox` is specified, it will simply output the filename to stderr, essentially acting as a dry run.
Having this builtin would make it less awkward to split JSON files. See below for some workarounds that are currently required:
- #2438
- #3121
- https://stackoverflow.com/questions/70569726/jq-split-json-in-several-files
Proposed semantics:
```
# sample script
to_entries[] | output_file(.key; .value) | .key

# stdin
{
  "a.txt": "string\nstring",
  "b/c.txt": "invalid",
  "d.json": {
    "e": 10
  }
}

# stdout
"a.txt"
"b/c.txt"
"d.json"

# stderr
b/c.txt: No such file or directory

# a.txt
string
string

# d.json
{"e":10}
```
If you're ok with using fq, I've used a tar hack a few times to output multiple files. Something like this:
Copy the tar code from https://github.com/wader/fq/wiki/snippets into `tar.jq`, then:
```
$ fq -n -L . 'include "tar"; to_tar({filename: "a", data: "aaa"}, {filename: "b", data: "bbb"})' | tar tv
-rw-r--r-- 0 user group 3 Jan 1 1970 a
-rw-r--r-- 0 user group 3 Jan 1 1970 b
```
Maybe you could rewrite the tar code to work with standard jq, but since jq does not support raw binary output, you might be limited to ASCII-only file contents.
A simple way of doing this is outputting a shell script from jq. That's what `@sh` is for.
```
jq -r 'to_entries[] | @sh "echo \(.value|tostring) > \(.key)"' | sh
```
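If the values can themselves be objects, or the keys contain directories (as in the `b/c.txt` example above), a slightly more defensive variant might look like this (a hypothetical sketch; `tostring` serializes non-string values, and the generated `mkdir -p` creates missing parent directories):

```
jq -r 'to_entries[]
  | @sh "mkdir -p $(dirname \(.key)) && printf %s \(.value | tostring) > \(.key)"' | sh
```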
> A simple way of doing this is outputting a shell script from jq. That's what `@sh` is for.
In general, I also prefer outputting shell scripts over something like #3133 (although I didn't know about `@sh` for escaping - thanks for that!)
However, there have been several times when I've had to split the output into multiple shell scripts, and the resulting doubly-escaped script was a headache to review; that's what inspired this proposal.
Presumably, there are many other places where having this as a builtin would be a nice quality-of-life improvement.
Agree that it would be nice to have more I/O features. In my view the biggest issue is how to make it all fit nicely together, e.g. https://github.com/jqlang/jq/pull/1843 includes file handle support that would make some of this possible to implement as builtins, I think. Then also, what would be good names and APIs? `input/1` to read a file (as JSON or a string, and how to specify?), `output/1` to write, `tee/1` to write and pass through? Things like that.
Maybe a way forward could be to flesh out how these APIs could look and how a user would use them, and then see what subset could be implemented without major changes? That way we could minimize the risk of adding something that turns out to be incompatible or awkward to combine with future fancier I/O, coeval, etc. support.
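To make the question concrete, here is a purely illustrative sketch of how such builtins might read; the names and semantics are just the ones floated above, and none of this exists today:

```
# purely hypothetical; none of these builtins exist
input("config.json")        # read a file and parse it as JSON?
input("notes.txt"; "raw")   # or an extra argument to ask for the raw string?
output("out.json")          # serialize the input and write it to a file?
tee("log.json")             # like output/1, but also pass the input through?
```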
> Agree that it would be nice to have more I/O features. In my view the biggest issue is how to make it all fit nicely together, e.g. #1843 includes file handle support that would make some of this possible to implement as builtins, I think.
For IO, I would advocate for having very few individually tailored high-level primitives, rather than many low-level building blocks like in that PR.
Due to the nature of jq being a functional language, interacting with the outside world is a much more advanced feature than usual[^1], and can end up being surprisingly asymmetrical (see below).

[^1]: For another example of typically standard functionality being treated as advanced, note that jq officially considers variables an advanced feature.
I'm even willing to be convinced that IO doesn't belong in jq at all (hence this proposal being opened as an issue rather than a PR).
> `input/1` to read a file (as JSON or a string, and how to specify?)
I would actually advocate for something like an `--input-var` option instead, which reads all the files into an `$input` variable containing a filename-to-contents map (essentially a more generalized form of `--slurpfile` and `--rawfile`).
Usage would be something like:
```
jq --input-var '$input | .["a.json"]' *.json
```
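Something close to this can already be approximated with existing builtins (a rough sketch; it only covers JSON inputs, whereas `--input-var` would presumably generalize to raw contents too, and it relies on `input_filename` tracking the file each value came from):

```
# approximate $input as a filename-to-document map with current jq
jq -n 'reduce inputs as $doc ({}; .[input_filename] = $doc) | .["a.json"]' *.json
```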
> Maybe a way forward could be to flesh out how these APIs could look and how a user would use them, and then see what subset could be implemented without major changes? That way we could minimize the risk of adding something that turns out to be incompatible or awkward to combine with future fancier I/O, coeval, etc. support.
Another way to manage this risk could be to prefix experimental APIs (for example, this could be named `_exp_output_file`) and print warnings that the functionality is subject to change.