jsonnet
jsonnet copied to clipboard
Add an option to avoid unnecessary quoting in manifested YAML
expected:
apiVersion: apps/v1
actual:
"apiVersion": "apps/v1"
Precedent: #682 and #717.
It's a duplicate for https://github.com/google/jsonnet/issues/717 for sure, but https://github.com/google/jsonnet/issues/682 is unrelated ... using single-quotes is a thing very few ppl need afaik so would prefer keeping this issue or reopening https://github.com/google/jsonnet/issues/717 and closing this
You're right, this is different. Choosing which strings to quote in YAML is a tricky issue. See the famous Norway Problem.
If you could find a comprehensive set of rules of when we could omit the quotes, it would help a lot here.
Let's start with keys, the main noise, anything that does not include :
should not be escaped.
>> require 'yaml'
=> true
>> {'a-b' => 1}.to_yaml
=> "---\na-b: 1\n"
>> {'a -b' => 1}.to_yaml
=> "---\na -b: 1\n"
>> {'a:-b' => 1}.to_yaml
=> "---\na:-b: 1\n"
>> {'a: -b' => 1}.to_yaml
=> "---\n'a: -b': 1\n"
>> {'a:b∂' => 1}.to_yaml
=> "---\na:b∂: 1\n"
... to play it super-safe could even do anything that does not only include [a-zA-z\d_-\.\/]
and that should cover the 99% case
... we can address values later if anyone cares
https://symfony.com/doc/current/components/yaml/yaml_format.html is a pretty good starting point for any future "perfect" implementation ... but let's just solve 99% of the problem with 1% of the effort ;)
No, this is not safe. Consider the following YAML:
no: 42
It is canonicalized to the following JSON:
{
"false": 42
}
Try it in https://yaml-online-parser.appspot.com/.
There's also a more subtle problem of things that have a different type when they're not escaped, e.g. true
, false
and numbers.
Since you mentioned ruby, please take a look at some special cases that they implement:
irb(main):010:0> {"no" => 42}.to_yaml
=> "---\n'no': 42\n"
irb(main):011:0> {"nope" => 42}.to_yaml
=> "---\nnope: 42\n"
irb(main):012:0> {"111" => 42}.to_yaml
=> "---\n'111': 42\n"
irb(main):013:0> {"111a" => 42}.to_yaml
=> "---\n111a: 42\n"
There are probably quite a few more gotchas like that, which I am not aware of. So it's not nearly as simple as you described. Someone needs to dive deeply in YAML spec and figure out what needs to be escaped.
can we use some yaml c bindings or whatever language it can interact with ?
Not really. We want all the behavior to be well-defined and the best way to ensure that is to have reference implementation in Jsonnet whenever possible.
In principle everyone should be able to take Jsonnet spec + stdlib code and produce a completely compatible implementation for any platform using any language (it's not 100% like that, but that's the ideal we are aiming for). And there are other existing Jsonnet implementations used in production (google/go-jsonnet, databricks/sjsonnet) and we want to keep them all compatible. We sometimes use external libraries, but we need to have a very good reason.
Someone needs to dive deeply in YAML spec and figure out what needs to be escaped.
Hopefully the work could be limited somewhat by considering only keys (not values), as @grosser suggested.
Also, it could be an opt-in behavior, via a new named parameter on std.manifestYamlStream
that defaults to false. E.g. prettyPrintKeys=false
. That way, users of Jsonnet get to choose whether they want the new behavior or not.
Hopefully the work could be limited somewhat by considering only keys (not values), as @grosser suggested.
I'm afraid it doesn't help much. Per YAML spec 7.4.2: "Note that YAML allows arbitrary nodes to be used as keys. In particular, a key may be a sequence or a mapping. " – so you still need to check if it's not going to be interpreted as any other YAML construct.
Also, it could be an opt-in behavior, via a new named parameter on std.manifestYamlStream that defaults to false. E.g. prettyPrintKeys=false. That way, users of Jsonnet get to choose whether they want the new behavior or not.
Yes, that would be a good idea.
Some related work: https://github.com/databricks/sjsonnet/pull/90. If we could this sort of processing as a Jsonnet library it would be a first step here.