jsonnet icon indicating copy to clipboard operation
jsonnet copied to clipboard

Add an option to avoid unnecessary quoting in manifested YAML

Open grosser opened this issue 4 years ago • 10 comments

expected:

apiVersion: apps/v1

actual:

"apiVersion": "apps/v1"

grosser avatar Jun 13 '20 01:06 grosser

Precedent: #682 and #717.

seh avatar Jun 13 '20 02:06 seh

It's a duplicate for https://github.com/google/jsonnet/issues/717 for sure, but https://github.com/google/jsonnet/issues/682 is unrelated ... using single-quotes is a thing very few ppl need afaik so would prefer keeping this issue or reopening https://github.com/google/jsonnet/issues/717 and closing this

grosser avatar Jun 13 '20 03:06 grosser

You're right, this is different. Choosing which strings to quote in YAML is a tricky issue. See the famous Norway Problem.

If you could find a comprehensive set of rules of when we could omit the quotes, it would help a lot here.

sbarzowski avatar Jun 13 '20 08:06 sbarzowski

Let's start with keys, the main noise, anything that does not include : should not be escaped.

>> require 'yaml'
=> true
>> {'a-b' => 1}.to_yaml
=> "---\na-b: 1\n"
>> {'a -b' => 1}.to_yaml
=> "---\na -b: 1\n"
>> {'a:-b' => 1}.to_yaml
=> "---\na:-b: 1\n"
>> {'a: -b' => 1}.to_yaml
=> "---\n'a: -b': 1\n"
>> {'a:b∂' => 1}.to_yaml
=> "---\na:b∂: 1\n"

... to play it super-safe could even do anything that does not only include [a-zA-z\d_-\.\/] and that should cover the 99% case

... we can address values later if anyone cares

https://symfony.com/doc/current/components/yaml/yaml_format.html is a pretty good starting point for any future "perfect" implementation ... but let's just solve 99% of the problem with 1% of the effort ;)

grosser avatar Jun 13 '20 17:06 grosser

No, this is not safe. Consider the following YAML:

no: 42

It is canonicalized to the following JSON:

{
    "false": 42
}

Try it in https://yaml-online-parser.appspot.com/.

There's also a more subtle problem of things that have a different type when they're not escaped, e.g. true, false and numbers.

Since you mentioned ruby, please take a look at some special cases that they implement:

irb(main):010:0> {"no" => 42}.to_yaml
=> "---\n'no': 42\n"
irb(main):011:0> {"nope" => 42}.to_yaml
=> "---\nnope: 42\n"
irb(main):012:0> {"111" => 42}.to_yaml
=> "---\n'111': 42\n"
irb(main):013:0> {"111a" => 42}.to_yaml
=> "---\n111a: 42\n"

There are probably quite a few more gotchas like that, which I am not aware of. So it's not nearly as simple as you described. Someone needs to dive deeply in YAML spec and figure out what needs to be escaped.

sbarzowski avatar Jun 13 '20 17:06 sbarzowski

can we use some yaml c bindings or whatever language it can interact with ?

grosser avatar Jun 13 '20 17:06 grosser

Not really. We want all the behavior to be well-defined and the best way to ensure that is to have reference implementation in Jsonnet whenever possible.

In principle everyone should be able to take Jsonnet spec + stdlib code and produce a completely compatible implementation for any platform using any language (it's not 100% like that, but that's the ideal we are aiming for). And there are other existing Jsonnet implementations used in production (google/go-jsonnet, databricks/sjsonnet) and we want to keep them all compatible. We sometimes use external libraries, but we need to have a very good reason.

sbarzowski avatar Jun 13 '20 18:06 sbarzowski

Someone needs to dive deeply in YAML spec and figure out what needs to be escaped.

Hopefully the work could be limited somewhat by considering only keys (not values), as @grosser suggested.

Also, it could be an opt-in behavior, via a new named parameter on std.manifestYamlStream that defaults to false. E.g. prettyPrintKeys=false. That way, users of Jsonnet get to choose whether they want the new behavior or not.

JohnRusk avatar Aug 23 '20 19:08 JohnRusk

Hopefully the work could be limited somewhat by considering only keys (not values), as @grosser suggested.

I'm afraid it doesn't help much. Per YAML spec 7.4.2: "Note that YAML allows arbitrary nodes to be used as keys. In particular, a key may be a sequence or a mapping. " – so you still need to check if it's not going to be interpreted as any other YAML construct.

Also, it could be an opt-in behavior, via a new named parameter on std.manifestYamlStream that defaults to false. E.g. prettyPrintKeys=false. That way, users of Jsonnet get to choose whether they want the new behavior or not.

Yes, that would be a good idea.

sbarzowski avatar Aug 24 '20 03:08 sbarzowski

Some related work: https://github.com/databricks/sjsonnet/pull/90. If we could this sort of processing as a Jsonnet library it would be a first step here.

sbarzowski avatar Sep 08 '20 17:09 sbarzowski