js-yaml
Literal block style multilines don't survive round trip; can it be forced?
When using literal multiline block strings inside my YAML file, converting to JS, and then back, sometimes the multiline literal block survives, but not always.
Is there a way to force a particular behavior? Having them be literal block strings inside the YAML is, IMHO, definitely more readable and editable by humans.
An example program:
const yaml = require('js-yaml')
const bad = `
abc:
- name: xxx
sample: |
{
"abc": 2,
"def" : {
"efg" : {
"ok" : 6666,
"nope" : "./yyy",
`.trim() // make sure the whitespace is not the difference
const good = `
abc:
- name: xxx
sample: |
{
"abc": 2,
"def" : {
`.trim() // make sure the whitespace is not the difference
const badDoc = yaml.safeLoad(bad)
const goodDoc = yaml.safeLoad(good)
const goodDumped = yaml.safeDump(goodDoc)
if( goodDumped === good ) {
  console.log( "Round trip was good for GOOD file\n" )
}
const badDumped = yaml.safeDump(badDoc).trim()
if( badDumped !== bad ) {
  console.log( "Round trip was bad for BAD file" )
  console.log( "INPUT: ", bad, "\n\n" )
  console.log( "OUTPUT: ", badDumped,"\n\n" )
}
else {
  console.log( badDumped, bad )
}
The output of this is:
Round trip was good for GOOD file
Round trip was bad for BAD file
INPUT: abc:
- name: xxx
sample: |
{
"abc": 2,
"def" : {
"efg" : {
"ok" : 6666,
"nope" : "./yyy",
OUTPUT: abc:
- name: xxx
sample: "{\n \"abc\": 2,\n \"def\" : {\n \"efg\" : {\n\t\"ok\" : 6666,\n\t\"nope\" : \"./yyy\",\n"
Note that sample has been turned into a string with newlines encoded, rather than a literal multiline block.
It seems strange that the shorter version (good) does preserve literal block style, but the longer one does not. Am I missing something subtle about using the library?
(IMHO, it would seem more usable to have shorter strings drop the literal multiline block, while longer ones always use the | format, but the library works the opposite way.)
I would really like a way to always have multiline values (any strings with newlines) to be output as literal multiline block strings so that I have the option to edit them by hand easily.
I'm using version "js-yaml": "^3.12.1"
There is no option to force a style. There is only some logic that selects the most suitable style automatically. Maybe it can be improved.
I'm happy to help with a PR if you give me some guidance on where to look @puzrin . Thanks for looking at the issue.
Somewhere here. I didn't write this part, and I don't remember the details.
Interesting comment here:
// Also prefer folding a super-long line.
https://github.com/nodeca/js-yaml/blob/master/lib/js-yaml/dumper.js#L301
I wonder if this means the person prefers folding (collapsing) a long line. I prefer the opposite behavior, if I understand what they mean.
Perhaps related: I'm seeing SOME |-style block strings dumped as >, which will ruin the spaces when read back again. But not both.
# input
---
mode: citation
input:
- id: ITEM-1
type: book
result: |
a,d,e,f
csl: |
<?xml version="1.0" encoding="utf-8"?>
<style xmlns="http://purl.org/net/xbiblio/csl" class="note" version="1.0.1" default-locale="en-US">
<info><id>https://cormacrelf.net/citeproc-rs/test-style</id><title>test-style</title></info>
<macro name="Inner">
<text value="d" />
<text value="e" />
</macro>
<citation>
<layout>
<group delimiter="," >
<text value="a" />
<text macro="Inner" />
<text value="f" />
</group>
</layout>
</citation>
</style>
# yaml.safeDump(yaml.safeLoad(...input...))
---
mode: citation
input:
- id: ITEM-1
type: book
result: |
a,d,e,f
csl: >
<?xml version="1.0" encoding="utf-8"?>
<style xmlns="http://purl.org/net/xbiblio/csl" class="note" version="1.0.1"
default-locale="en-US">
<info><id>https://cormacrelf.net/citeproc-rs/test-style</id><title>test-style</title></info>
<macro name="Inner">
<text value="d" />
<text value="e" />
</macro>
<citation>
<layout>
<group delimiter="," >
<text value="a" />
<text macro="Inner" />
<text value="f" />
</group>
</layout>
</citation>
</style>
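Whether the switch from | to > actually changes the string can be checked by comparing the loaded values rather than the YAML text. The following is only a sketch: it assumes js-yaml 3.x (where safeLoad and safeDump exist, as used in this thread) and takes the library as an argument; roundTripsAsData is a hypothetical helper name, not part of js-yaml.

```javascript
const { isDeepStrictEqual } = require('util');

// Hypothetical helper: load the text, dump it, load it again, and compare
// the two loaded values. `yaml` is the js-yaml module, passed in by the caller.
function roundTripsAsData(yaml, text, dumpOptions = {}) {
  const before = yaml.safeLoad(text);
  const after = yaml.safeLoad(yaml.safeDump(before, dumpOptions));
  // true means no data was lost, even if the block style changed
  return isDeepStrictEqual(before, after);
}

// Usage (requires js-yaml 3.x to be installed):
// const yaml = require('js-yaml');
// console.log(roundTripsAsData(yaml, inputText));
```

If this returns true for the document above, the style change is cosmetic for that document; if it returns false, the dumped text no longer represents the same string.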
I made a pull request for a change that adds a flag that makes it possible to force literal block style.
// Also prefer folding a super-long line.
This comment does not make functional sense. Anyone using this library to generate YAML containing multi-line blocks with white space sensitive content that goes over an arbitrary line length will get caught on this (which is increasingly common given the pattern of inlining resources in Kubernetes objects such as ConfigMaps). I think it makes more sense for the code to attempt to honor the contents of the string, which would essentially always be |.
It seems that passing lineWidth: -1 makes the dumper choose STYLE_LITERAL as the block style: https://github.com/nodeca/js-yaml/blob/d6983dd4291849b2854e8d26e1beb302edfd4c76/lib/js-yaml/dumper.js#L274
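A minimal sketch of that workaround, assuming js-yaml 3.x as used in this thread (so safeDump exists) and that the library is passed in by the caller. dumpUnfolded is a hypothetical wrapper name, not a js-yaml function. Going by the linked source, lineWidth: -1 turns off width tracking, so a long line is no longer a reason to pick the folded > style; it is not expected to help with strings that fall back to double quotes for other reasons.

```javascript
// Hypothetical wrapper: always dump with line folding disabled.
// `yaml` is the js-yaml module (e.g. require('js-yaml')), passed in so the
// helper itself has no hard dependency on it.
function dumpUnfolded(yaml, doc, extraOptions = {}) {
  // lineWidth: -1 is applied last so callers cannot override it by accident.
  const options = Object.assign({}, extraOptions, { lineWidth: -1 });
  return yaml.safeDump(doc, options);
}

// Usage (requires js-yaml 3.x to be installed):
// const yaml = require('js-yaml');
// console.log(dumpUnfolded(yaml, yaml.safeLoad(inputText)));
```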
It seems strange that the shorter version (good) does preserve literal block style, but the longer one does not. Am I missing something subtle about using the library?
You are missing something subtle about your document. There are tabs (\t, 0x09) in the longer version, but not in the shorter version. That's the difference.
Whether tabs should or should not be in the output document is an interesting question. But adding a separate option just for these feels wrong.
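To see whether a document is affected, the tabs can be located before dumping. This is a sketch only: findTabs and expandTabs are hypothetical helpers, not part of js-yaml, and replacing tabs with spaces is an assumption about what is acceptable for your data (it changes the string contents). expandTabs assumes plain JSON-like data (strings, numbers, arrays, plain objects).

```javascript
// Report every tab character in a string as { line, column }, both 1-based.
function findTabs(text) {
  const hits = [];
  text.split('\n').forEach((lineText, i) => {
    for (let col = 0; col < lineText.length; col++) {
      if (lineText[col] === '\t') hits.push({ line: i + 1, column: col + 1 });
    }
  });
  return hits;
}

// Return a copy of `value` with every tab in every string replaced by
// `width` spaces. Non-string leaves are returned unchanged.
function expandTabs(value, width = 2) {
  if (typeof value === 'string') return value.replace(/\t/g, ' '.repeat(width));
  if (Array.isArray(value)) return value.map((v) => expandTabs(v, width));
  if (value !== null && typeof value === 'object') {
    const out = {};
    for (const key of Object.keys(value)) out[key] = expandTabs(value[key], width);
    return out;
  }
  return value;
}

// A string shaped like the "bad" sample: the third line starts with a tab.
const tabbed = '{\n  "efg" : {\n\t"ok" : 6666,\n';
console.log(findTabs(tabbed));
console.log(JSON.stringify(expandTabs({ abc: [{ sample: tabbed }] })));
```

Running expandTabs on the loaded document before dumping removes the tabs, which, per the reply above, are what made the longer sample fall back to a double-quoted string.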
Could it be that js-yaml adds these \t somehow? I have the same problem: some strings (that did not change) basically randomly switch between " and |.