
Default to literal style for multiline strings

Open · perlpunk opened this issue 6 months ago · 1 comment

Since many people complain that strings with line breaks default to folded single-quoted style, I opened this PR for discussion:

  • #240

The PR now outputs strings in literal style if they contain line breaks. Exceptions:

  • trailing spaces
  • special characters
  • the string contains only line breaks (it then uses double quotes and \n)
  • the longest line would exceed the number of columns left
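As a rough illustration of the rules above, here is a minimal sketch in plain Python. It is not the code from the PR: `prefers_literal` is a hypothetical helper, `indent` is assumed to be the column where the block content starts, `width` is assumed to be the emitter's line width, and "special characters" is assumed to mean non-printable characters other than newline and tab.

```python
def prefers_literal(text: str, indent: int = 0, width: int = 80) -> bool:
    """Hypothetical helper: True if the rules listed above would pick literal style."""
    if "\n" not in text:
        return False  # no line breaks, so the new default does not apply
    if text.strip("\n") == "":
        return False  # only line breaks: double quotes with \n instead
    lines = text.split("\n")
    if any(line.endswith(" ") for line in lines):
        return False  # trailing spaces
    if any(ch not in "\n\t" and not ch.isprintable() for ch in text):
        return False  # assumed meaning of "special characters"
    if max(len(line) for line in lines) > width - indent:
        return False  # longest line does not fit in the columns left
    return True


for sample in ["def\n", "\nghi", "abc\n ", "\n\n", "\n \n"]:
    print(repr(sample), prefers_literal(sample))
```

The real analysis lives in the emitter and may differ in detail; the sketch only mirrors the list of exceptions as described here.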

The test suite passes for me locally.

See the output for some examples before and after:

Code
import yaml

strings = [
"""
1234567890 1234567890 1234567890 1234567890
1234567890 1234567890 1234567890 1234567890 1234567890
""",
"""
1234567890 1234567890 1234567890
1234567890 1234567890 1234567890
1234567890 1234567890 1234567890
""",
]
strings2 = strings.copy()
strings3 = [
"abc\n ",
"def\n",
"\nghi",
"\njkl\n",
"\n\n",
"\n \n",
" \n \n",
]
data=[
[[[[[[[[[[[[[[[[[[[[[strings]]]]]]]]]]]]]]]]]]]]],
strings2, strings3
]

out = yaml.dump(data)
print(out)
Output in current main branch
- - - - - - - - - - - - - - - - - - - - - - - '

                                              1234567890 1234567890 1234567890 1234567890

                                              1234567890 1234567890 1234567890 1234567890
                                              1234567890

                                              '
                                            - '

                                              1234567890 1234567890 1234567890

                                              1234567890 1234567890 1234567890

                                              1234567890 1234567890 1234567890

                                              '
- - '

    1234567890 1234567890 1234567890 1234567890

    1234567890 1234567890 1234567890 1234567890 1234567890

    '
  - '

    1234567890 1234567890 1234567890

    1234567890 1234567890 1234567890

    1234567890 1234567890 1234567890

    '
- - "abc\n "
  - 'def

    '
  - '

    ghi'
  - '

    jkl

    '
  - '


    '
  - "\n \n"
  - " \n \n"
Output in PR
- - - - - - - - - - - - - - - - - - - - - - - '

                                              1234567890 1234567890 1234567890 1234567890

                                              1234567890 1234567890 1234567890 1234567890
                                              1234567890

                                              '
                                            - |2

                                              1234567890 1234567890 1234567890
                                              1234567890 1234567890 1234567890
                                              1234567890 1234567890 1234567890
- - |2

    1234567890 1234567890 1234567890 1234567890
    1234567890 1234567890 1234567890 1234567890 1234567890
  - |2

    1234567890 1234567890 1234567890
    1234567890 1234567890 1234567890
    1234567890 1234567890 1234567890
- - "abc\n "
  - |
    def
  - |2-

    ghi
  - |2

    jkl
  - "\n\n"
  - "\n \n"
  - " \n \n"

For strings with multiple line breaks it now behaves like libyaml.
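Whichever scalar style the emitter picks, the loaded value should stay the same, since the style is only presentation. A small round-trip check along these lines can be used to confirm that for the short test strings above. It is a sketch using only `yaml.safe_dump` and `yaml.safe_load`; the `round_trips` helper name is made up for this example.

```python
import yaml

samples = ["abc\n ", "def\n", "\nghi", "\njkl\n", "\n\n", "\n \n", " \n \n"]


def round_trips(value):
    """Dump a value and load it back; True if nothing changed."""
    return yaml.safe_load(yaml.safe_dump(value)) == value


for s in samples:
    print(repr(s), round_trips(s))
```

The same check can be run against both the main branch and the PR branch to make sure only the presentation differs.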

There are still differences from libyaml, though.

Current PR:

- "abc\n "
- |
  def
- |2-

  ghi
- |2

  jkl

libyaml:

- "abc\n "
- "def\n"
- "\nghi"
- "\njkl\n"

So if there is only one line break, libyaml uses double quotes. If we want that too, we would have to add another attribute to the ScalarAnalysis class, I think.

perlpunk, Aug 10 '24 20:08