support yaml input/output in addition to json

Open underrun opened this issue 11 years ago • 64 comments

I know it's a big request, but you never know unless you ask, right?

underrun avatar Jul 07 '14 21:07 underrun

The plan is to have multiple parser options at some point. But only JSON data types (null, boolean, string, number, array, object) will ever be supported.

nicowilliams avatar Jul 07 '14 22:07 nicowilliams

I suppose that means the JSON limits on object keys apply as well, even though YAML supports just about anything as a key in a mapping?

underrun avatar Jul 08 '14 14:07 underrun

Correct. jq internally assumes the JSON (not JavaScript) data model. No NaNs, no infinities, no non-string keys, no difference between tuples and arrays, only null, booleans, numbers, strings, arrays, and objects.

Changing this would require a ton of work and might require deep changes to the language. Full YAML, being a superset of JSON, is kind of out, but a subset of YAML could be OK.

nicowilliams avatar Jul 08 '14 15:07 nicowilliams
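
A minimal Python sketch of the data-model limits listed above. It uses Python's json module as a stand-in for the JSON data model; it is not jq's implementation, and the sample values are made up for illustration.

```python
import json

# A Python value that uses features outside the JSON data model.
value = {
    "tuple": (1, 2),   # tuples have no JSON counterpart
    "array": [1, 2],
}

# Tuples and lists both serialize as JSON arrays, so the distinction is lost.
round_tripped = json.loads(json.dumps(value))
print(round_tripped["tuple"] == round_tripped["array"])

# NaN and infinities are rejected when strict JSON output is requested.
rejected = []
for bad in (float("nan"), float("inf")):
    try:
        json.dumps(bad, allow_nan=False)
    except ValueError:
        rejected.append(bad)
print(len(rejected), "non-finite numbers rejected")

# Non-string keys cannot survive: the integer key is coerced to a string.
print(json.dumps({1: "one"}))
```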

@nicowilliams wrote:

The plan is to have multiple parser options at some point.

One feature request I'd like to make, if it isn't already covered by an existing issue, is what I'll call the "Farewell to awk" flag: an option to read in a UTF-8 file as a sequence of (JSON) strings, one string per line. Is that covered elsewhere?

pkoppstein avatar Jul 08 '14 16:07 pkoppstein

@pkoppstein There's already raw input and raw output options.

nicowilliams avatar Jul 08 '14 16:07 nicowilliams
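
As a rough illustration of what "one JSON string per line" means, here is a small Python sketch. The function name lines_as_json_strings is hypothetical, and the code only approximates the idea behind jq's raw input option; it does not reproduce jq's behavior in every edge case.

```python
import json

def lines_as_json_strings(text):
    """Return each line of text encoded as a JSON string."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # a trailing newline does not start a new line
    return [json.dumps(line, ensure_ascii=False) for line in lines]

sample = 'plain text\nwith "quotes"\n'
for encoded in lines_as_json_strings(sample):
    print(encoded)
```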

@nicowilliams - Thanks!

pkoppstein avatar Jul 08 '14 16:07 pkoppstein

This would definitely be useful. Output is straightforward enough, and input could be done if limited to that subset of YAML that's equivalent to JSON. The JSON-equivalent subset of YAML is a useful language for configuration files and the like (Full YAML is scarily complicated).

Values in jq are always acyclic, so YAML's recursion won't be supported.

stedolan avatar Jul 31 '14 15:07 stedolan

@stedolan My plan is to have multiple parsers/encoders accessible from the I/O builtins.

I want parsers for:

  • YAML subset (per this issue and your comment)
  • streaming JSON (output pairs of path and leaf value)

nicowilliams avatar Jul 31 '14 15:07 nicowilliams
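
To make "pairs of path and leaf value" concrete, here is a simplified Python sketch. The function name leaf_paths is hypothetical, and the code parses the whole document first, which a real streaming parser would avoid; the form jq eventually emits may also differ.

```python
import json

def leaf_paths(value, path=()):
    """Yield [path, leaf] pairs for every leaf in a parsed JSON value."""
    if isinstance(value, dict) and value:
        for key, child in value.items():
            yield from leaf_paths(child, path + (key,))
    elif isinstance(value, list) and value:
        for index, child in enumerate(value):
            yield from leaf_paths(child, path + (index,))
    else:
        # Scalars and empty containers are treated as leaves.
        yield [list(path), value]

doc = json.loads('{"store": {"book": [{"title": "Moby Dick", "price": 8.99}]}}')
for pair in leaf_paths(doc):
    print(json.dumps(pair))
```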

Is this something you're planning to work on in the near future? If not, is it something that's feasible for a newcomer to the project to work on?

abesto avatar Jan 08 '15 09:01 abesto

I should add that while supporting type annotations on input is not unreasonable, jq will not support "date" or other such types internally, therefore also not on output (though an output module could use schema to insert type annotations).

Ideally all such parsers and encoders could be written in jq itself. We'll need a byte at a time raw input mode for that :(

nicowilliams avatar Jan 08 '15 14:01 nicowilliams

I should also add that full YAML will never be supported by jq, for obvious reasons :)

nicowilliams avatar Jan 12 '15 16:01 nicowilliams

I found https://github.com/preaction/ETL-Yertl, a Perl module that does something similar for YAML.

jmdcal avatar Apr 23 '15 14:04 jmdcal

For those of you who would like to use jq filters on YAML documents, I have written a simple wrapper script that transforms YAML to JSON with a simple python filter, invokes jq on the JSON and then transforms the JSON back to YAML with python.

The script is implemented in bash and will use a locally installed version of jq and python if they are on the path or delegate to a docker container containing the same otherwise.

See http://github.com/wildducktheories/y2j for more details and let me know if you have any feedback.

jonseymour avatar Aug 02 '15 11:08 jonseymour

Transforming yaml to json will lose something - references... knowing something is the "same" object matters for some applications.

underrun avatar Aug 02 '15 14:08 underrun
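
The loss of references can be seen without any YAML library. In this Python sketch, two keys point at the same in-memory object, much as a YAML anchor and alias would; the names shared and config and the sample values are made up for illustration.

```python
import json

# Two keys refer to the very same object.
shared = {"host": "db.example", "port": 5432}
config = {"primary": shared, "replica": shared}
print(config["primary"] is config["replica"])  # same object before the round trip

# A JSON round trip keeps the values but not the shared identity.
copy = json.loads(json.dumps(config))
print(copy["primary"] == copy["replica"])  # values are still equal
print(copy["primary"] is copy["replica"])  # but they are now separate objects
```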

@jonseymour Awesome! I added a link to y2j in the wiki.

dtolnay avatar Aug 02 '15 17:08 dtolnay

@underrun Understood. I have updated the LIMITATIONS section of the README to add further clarification of this point. @dtolnay - thanks!

jonseymour avatar Aug 02 '15 22:08 jonseymour

Hmm, there's also this one: https://github.com/abesto/yq

No commits for 2 years then one a month ago

SamuelMarks avatar Feb 20 '16 01:02 SamuelMarks

I wrote my own jq wrapper supporting YAML input and output. To use it, clone the jqt repository and run make install. The yq script uses Bash and Python with the PyYAML module. The script imitates the jq command-line interface as closely as possible, for example by showing help and by not requiring input redirection:

$ yq '.store.book[2]' data/store.yaml
author: Herman Melville
category: fiction
isbn: 0-553-21311-3
price: 8.99
title: Moby Dick

There are also some enhancements, like outputting JSON directly:

$ yq --json -c '.store.book[2]' data/store.yaml
{"category":"fiction","price":8.99,"author":"Herman Melville","isbn":"0-553-21311-3","title":"Moby Dick"}

This is yq's on-screen help:

$ yq --help
yq - commandline YAML processor
Usage: yq [options] <jq filter> [file]

    yq is a wrapper to jq for processing YAML input, applying the given
    filter to it YAML text input and producing the filter's results as
    YAML or JSON on standard output.

    The options available are yq specific and also from jq. The yq
    options are:
     -h     Show this help
     -J     Preserve JSON output format
     -V     Output the jq version

    Some of the jq options include:
     -e     set the exit status code based on the output
     -f     Read filter from the file f
     -s     read (slurp) all inputs into an array; apply filter to it
     -S     sort keys of objects on output
     --arg a v          set variable $a to value v
     --argjson a v      set variable $a to JSON value v
     --slurpfile a f    set variable $a to an array of values read from f
    Not all jq options have sense using yq.

    For more advanced filters see the jq(1) manpage and
    https://stedolan.github.io/jq

JJOR

fadado avatar May 23 '16 06:05 fadado

@nicowilliams @wtlangford, I am using jq a lot with docker inspect and Docker-in-docker with very lean Alpine Linux images where support for YAML for docker-compose.yml would be very welcome and increase the usefulness of jq manyfold for me and others.

Because it has so few dependencies, it is far preferable to other tools that need Python, Lua, Perl or similar. Keeping Docker images as lean as possible helps encapsulate specialised build environments/images and other use cases with container images for continuous integration/delivery. jq will easily become an integral staple on all Alpine Docker images, especially with YAML support, so we wouldn't need any additional or alternative tools for custom Docker/Snappy toolchains. 🐳 📦

runelabs avatar Jul 10 '16 22:07 runelabs

here's "yq" instead of jq: https://gist.github.com/earonesty/1d7cb531bb8fff8c228b7710126bcc33

earonesty avatar Dec 28 '16 18:12 earonesty

@nicowilliams Please advise.

szepeviktor avatar Dec 28 '16 18:12 szepeviktor

FYI, I have been maintaining https://github.com/kislyuk/yq, which is a straightforward wrapper for jq using PyYAML. (It's also published on PyPI so you can pip install yq - this supersedes https://github.com/abesto/yq).

kislyuk avatar Dec 27 '17 02:12 kislyuk

@kislyuk - Thank you! The jq FAQ now includes the https://github.com/kislyuk/yq link.

pkoppstein avatar Dec 27 '17 03:12 pkoppstein

@pkoppstein Could you point out where the link is?

szepeviktor avatar Dec 27 '17 08:12 szepeviktor

@szepeviktor - Thanks for paying attention!

pkoppstein avatar Dec 27 '17 16:12 pkoppstein

While I don't have a need for YAML input (and YAML input would certainly be lossy), having a YAML output mode would be extremely handy for being able to generate configuration inputs for tools that don't accept JSON input.

mikemol avatar Jul 13 '18 11:07 mikemol

@mikemol, YAML is a natural superset of JSON. Any tool that accepts YAML should also accept JSON as output by jq.

thedward avatar Jul 14 '18 01:07 thedward

Semantically, sure. Syntactically, not so much. See, for example, ConcourseCI's fly command, which accepts YAML, not JSON, as input, even though JSON would be strictly sufficient from a semantics perspective based on how it's used by the program.

On Fri, Jul 13, 2018, 9:14 PM Thedward Blevins wrote:

@mikemol, YAML is a natural superset of JSON (http://yaml.org/spec/1.2/spec.html#id2759572). Any tool that accepts YAML should also accept JSON as output by jq.

mikemol avatar Jul 14 '18 01:07 mikemol

Valid json without duplicated keys in an object is syntactically valid flow style yaml.

Unless a feature json doesn't support is needed (like refs or multiple documents per stream) then jq output will syntactically work with any tool that takes yaml as input.

If there is need for a feature json doesn't support then just outputting in block style yaml will have the same issues.

There is value in a tool like jq (or jq itself) that actually supports all the features of yaml, but it really is only in cases where actual features of yaml that json can't support are really needed. Definitely limited use, but not completely zero.

underrun avatar Jul 14 '18 02:07 underrun
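
The "without duplicated keys" condition above can be checked mechanically. This is a minimal Python sketch using the standard json module; the function names reject_duplicate_keys and load_strict are hypothetical, and this is not how jq itself treats duplicate keys.

```python
import json

def reject_duplicate_keys(pairs):
    """Build a dict from key/value pairs, failing if any key repeats."""
    seen = set()
    for key, _ in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen.add(key)
    return dict(pairs)

def load_strict(text):
    """Parse JSON text, rejecting objects that repeat a key."""
    return json.loads(text, object_pairs_hook=reject_duplicate_keys)

print(load_strict('{"a": 1, "b": {"a": 2}}'))  # same key at different levels is fine
try:
    load_strict('{"a": 1, "a": 2}')
except ValueError as exc:
    print("rejected:", exc)
```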

@mikemol - The jq FAQ mentions several YAML-related tools that might be of interest to you, especially https://github.com/dbohdan/remarshal as it includes JSON-to-YAML functionality.

pkoppstein avatar Jul 14 '18 03:07 pkoppstein

One could also use this tool that is written in Golang if one does not want to install pip.

030 avatar Oct 28 '18 21:10 030

Might someone write and contribute a fromyaml not unlike the fromcsv in #1650? That'd be great.

nicowilliams avatar Dec 17 '18 17:12 nicowilliams

I just released yjq with similar functionality. It's written in Go and the release page provides binaries for all major os/arch. It also supports YAML Input -> JSON output and JSON Input -> YAML output allowing it to be used with jq pipes.

alxarch avatar Feb 01 '19 16:02 alxarch

You can also try https://github.com/woky/tojson.

woky avatar Feb 05 '19 22:02 woky

I created pretty small static binaries for this need at https://github.com/gsf/yaml2json/releases.

gsf avatar Feb 06 '19 14:02 gsf

Here's something similar that supports even more formats (BSON, Bencode, JSON, TOML, XML, YAML): https://github.com/jzelinskie/faq

njhale avatar Feb 23 '19 00:02 njhale

And there are also multiple candidates who want to be what jq is for json:

https://github.com/mikefarah/yq/ https://github.com/mikefarah/yq/releases/tag/2.2.1

tdussmann avatar Mar 01 '19 15:03 tdussmann

mikefarah/yq is available in Homebrew, but it lacks features such as filters. The best option is to convert to JSON, do the work, and convert the result back to YAML 😛

yq r -j file.yq | jq $filter_command | yq -

voiski avatar Apr 16 '19 17:04 voiski

@voiski: The original yq, https://github.com/kislyuk/yq, does exactly that. It is available in Homebrew as python-yq.

kislyuk avatar Apr 16 '19 18:04 kislyuk

Thanks, @kislyuk, it is a good idea to wrap the solution. jq is far ahead of any YAML processor. For CI, I will still prefer @mikefarah's solution, which is a single binary and avoids the need to bake Python into a Docker image.

voiski avatar Apr 16 '19 22:04 voiski

Such a trivial and wanted feature... and it has never been implemented. So we need to use external tools just to convert messy, bloated JSON to clean YAML. Sigh...

OnkelTem avatar May 03 '19 21:05 OnkelTem

still looking for @yml or --yaml-output option.

DonBower avatar May 09 '19 01:05 DonBower

@DonBower Could you give an example what you are looking for? Does this tool help?

030 avatar May 09 '19 08:05 030

For YAML output it would be enough to implement a jq module (aka jq function library) that defines a function toyaml. The counterpart may be more difficult and create a performance bottleneck.

nichtich avatar May 10 '19 07:05 nichtich
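
To show why the output direction is the easy half, here is a small block-style serializer sketched in Python rather than as a jq module. The names to_yaml, _entry and _scalar are hypothetical. The sketch assumes that JSON scalars (double-quoted strings, numbers, true, false, null) are acceptable YAML scalars, so it quotes every string and key instead of working out when quotes can be dropped.

```python
import json

def _scalar(value):
    """Encode a scalar or empty container using JSON syntax."""
    return json.dumps(value, ensure_ascii=False, allow_nan=False)

def _entry(prefix, child, indent):
    """Emit one mapping or sequence entry, nesting non-empty containers."""
    if isinstance(child, (dict, list)) and child:
        return [prefix] + to_yaml(child, indent + 2)
    return [prefix + " " + _scalar(child)]

def to_yaml(value, indent=0):
    """Return block-style YAML lines for a JSON-model value."""
    pad = " " * indent
    lines = []
    if isinstance(value, dict) and value:
        for key, child in value.items():
            lines.extend(_entry(pad + _scalar(key) + ":", child, indent))
    elif isinstance(value, list) and value:
        for child in value:
            lines.extend(_entry(pad + "-", child, indent))
    else:
        lines.append(pad + _scalar(value))
    return lines

book = {"author": "Herman Melville", "price": 8.99, "tags": ["fiction", "classic"]}
print("\n".join(to_yaml(book)))
```

The reverse direction would need a real YAML parser, which is the harder part mentioned above.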

On Fri, 10 May 2019, 09:23 Jakob Voß wrote:

For YAML output it would be enough to implement a jq module (aka jq function library) that defines a function toyaml. The counterpart may be more difficult and create a performance bottleneck.

Perhaps this module

https://github.com/fadado/JBOL/blob/master/fadado.github.io/json/json.jq

implementing toxml (wrapped inside xmldoc) can be easily adapted to implement toyaml.

This shows how to use the module:

https://github.com/fadado/JBOL/blob/master/bin/jxml.jq https://github.com/fadado/JBOL/blob/master/bin/jxml

JJOR

fadado avatar May 10 '19 08:05 fadado

still looking for @yml or --yaml-output option.

@DonBower If you check the discussion, you will see some options that combine tools. You can use @tdussmann's answer, e.g. yq r -j file.yq | jq $filter_command | yq -, or the wrapper built by @kislyuk, which does exactly that.

@030 Use upx to make it smaller, because right now the mikefarah/yq solution is not just smarter but also smaller. We don't need power tools for that. jq is unsurpassed as a parser and handler, but we need a good solution for YAML support =)

voiski avatar May 10 '19 15:05 voiski

When I have the time to get back to jq I'll finish up my I/O branch, and maybe we can then add options for input/output formats, then someone could contribute support for YAML and XML.

nicowilliams avatar May 10 '19 15:05 nicowilliams

@nicowilliams As @OnkelTem mentioned, we have a lot of external solutions with multi-OS support, so I don't see this as a problem big enough to put pressure on you.

I believe some folks, including myself, are missing the earlier messages that offer solutions for this, some simpler and others more complex and complete (I missed that one too).

voiski avatar May 10 '19 16:05 voiski

Here is an incomplete jq snippet to serialize YAML.

nichtich avatar Jun 06 '19 20:06 nichtich

Not to add another tool to the list, but I recently created https://github.com/Blacksmoke16/oq.

It has the benefits of wrapping jq, being nearly as performant, and being portable via a single binary.

I wrote a blog post about it that shows some of its features, as well as some benchmarks. There is still some work to be done, but it is off to a good start.

Blacksmoke16 avatar Jul 13 '19 23:07 Blacksmoke16

@Blacksmoke16 - Very impressive! I'm also impressed that you used Crystal. I've been waiting, evidently in vain, for Version 1.0, and would be interested to know what your experience has been regarding the pluses and minuses of using Crystal for your project.

pkoppstein avatar Jul 14 '19 02:07 pkoppstein

@pkoppstein It's been quite positive. The syntax and the performance come together quite nicely to make for a nice experience. The other benefits of a single binary and quite extensive standard library made making this pretty easy.

I'd say the main downside currently is you can run into some compiler bugs every now and then which force you to find workarounds until they get fixed. As long as you keep an eye on the changelog, breaking changes haven't been too bad for me.

Also since it's relatively new, library support is not as mature as say Ruby; I find myself spending most of my time making PRs for the libraries I use. However, this project in particular went pretty smooth as it's not too complex, or relying upon fancier features like generics or macros.

If you want to chat more about it feel free to message me on Discord Blacksmoke16#0016.

Blacksmoke16 avatar Jul 14 '19 02:07 Blacksmoke16

I created pretty small static binaries for this need at https://github.com/gsf/yaml2json/releases.

It's funny how 1.8M can be considered "pretty small" these days.

tpo avatar Apr 04 '20 14:04 tpo

I now routinely use yj to handle translation from YAML to JSON prior to feeding into jq. I selected that tool since it was stupid-simple, and could also handle HCL and TOML.

mikemol avatar Jun 07 '20 03:06 mikemol

@mikemol could you please share a link to it?

OnkelTem avatar Jun 08 '20 09:06 OnkelTem

@OnkelTem https://github.com/sclevine/yj

mikemol avatar Jun 08 '20 11:06 mikemol

Many tools exist to convert between YAML and JSON but all require additional runtime environments, methods of installation, usage etc. Having jq as single binary with support of several operating system package managers would be most convenient. But I understand if this will not be implemented.

nichtich avatar Jun 09 '20 13:06 nichtich

The other issue is that YAML allows comments and tools must encode these if they are desired to be kept.

Of course, jq would have the same issue....unless it added core comment support (e.g. for json5).

pauldraper avatar Jun 15 '20 05:06 pauldraper

@nicowilliams I move to take this issue off life support. As much as I would love to be able to keep everything in one project, the "j" in jq does a fine job for what it is.

captain-kark avatar Jun 18 '20 20:06 captain-kark

For those still interested in support for YAML parsing in jq, have a look at https://github.com/biojppm/rapidyaml#comparison-with-yaml-cpp. In contrast to the converters mentioned before, it is implemented in C++, so it may be easier to build into the jq source code. I'd still consider supporting extensions of JSON on input if the underlying parsing engine supports them anyway.

nichtich avatar Jun 18 '20 20:06 nichtich

I am sad that this issue hasn't had discussion in two years. JSON is a strict subset of YAML, so it makes a certain amount of sense just to use a YAML parser on the front end and take all comers. Give me a flag for output in YAML instead of the default JSON and I'd be super happy.

If there is a poll going, I couldn't care less about comments. 99% of the time I use jq I'm transforming JSON to pass into some other system like Kubernetes, and none of them care about comments.

SleepyBrett avatar Apr 27 '22 18:04 SleepyBrett

I also use jq to validate JSON, so I'd argue against changing the default input format to accept anything more than sequences of valid JSON documents. YAML output is a different use case; it could be supported like the existing format strings and escaping (@csv, @tsv, ...) if someone is willing to implement it.

nichtich avatar Apr 29 '22 07:04 nichtich

@nichtich JSON is a subset of YAML. A full JSON document is 100% valid YAML. Replacing the underlying parsing engine with a YAML parser would not affect you at all.

SleepyBrett avatar Apr 29 '22 20:04 SleepyBrett

@SleepyBrett except when you want to strictly accept JSON but not YAML.

Nothing4You avatar Apr 29 '22 21:04 Nothing4You
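
The validation use case can be sketched in Python. The function name is_json_sequence is hypothetical and Python's json module stands in for jq's parser, so the edge cases are not guaranteed to match jq. The point is only that a strict JSON check must fail on input that is valid YAML but not JSON.

```python
import json

def _reject_constant(name):
    # Python's decoder would otherwise accept NaN and Infinity.
    raise ValueError(f"{name} is not valid JSON")

def is_json_sequence(text):
    """Return True if text holds zero or more JSON documents and nothing else."""
    decoder = json.JSONDecoder(parse_constant=_reject_constant)
    pos = 0
    while True:
        # Skip JSON whitespace between documents.
        while pos < len(text) and text[pos] in " \t\r\n":
            pos += 1
        if pos == len(text):
            return True
        try:
            _, pos = decoder.raw_decode(text, pos)
        except ValueError:  # JSONDecodeError is a subclass of ValueError
            return False

print(is_json_sequence('{"a": 1}\n[1, 2]\n'))  # a sequence of JSON documents
print(is_json_sequence('a: 1\n'))              # YAML mapping, not JSON
```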