pyyaml
Incorrect indentation with lists
When an indent is specified, it seems to be applied to the value of each list item instead of to the list itself. As you can see below, indent=4 is applied after the leading - and not to the list itself.
>>> print(yaml.dump(data['vars']['yaml'], indent=4, allow_unicode=True, default_flow_style=False))
list_of_dict_attr:
-   attr1: value1
    attr2: value2
    attr3:
    - item1
    - item2
single_attr: value1
>>> print(yaml.dump(data['vars']['yaml'], indent=2, allow_unicode=True, default_flow_style=False))
list_of_dict_attr:
- attr1: value1
  attr2: value2
  attr3:
  - item1
  - item2
single_attr: value1
original issue https://github.com/ansible/ansible/issues/48865
@bcoca can you please check if this is in the Python emitter code or if it's libyaml (ie CDumper) (or both)?
@ingydotnet Testing with Dumper=yaml.Dumper and Dumper=yaml.CDumper seems to produce the same result.
Thanks @sivel. This will need to be fixed in both. Patches welcome :)
It seems like the problem is occurring because self.indention is set to False in Emitter when expect_block_sequence is run, which makes the self.increase_indent code do nothing.
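For context, the indentation bookkeeping described here can be modelled in a few lines. This is a simplified sketch, not PyYAML's actual code: `next_indent` is a hypothetical stand-in for what `Emitter.increase_indent` does to the indent level, on the assumption that a block sequence opened as a mapping value is requested with `indentless=True`.

```python
from typing import Optional


def next_indent(current: Optional[int], best_indent: int,
                flow: bool = False, indentless: bool = False) -> int:
    """Toy model of how the emitter might pick the next indentation level."""
    if current is None:
        # Top-level node: block collections start at column 0.
        return best_indent if flow else 0
    if indentless:
        # The level is left unchanged, whatever indent= was requested.
        return current
    return current + best_indent


# A sequence that is the value of a top-level mapping key (current indent 0),
# dumped with indent=4:
print(next_indent(0, 4, indentless=True))   # level stays at 0
print(next_indent(0, 4, indentless=False))  # level grows to 4
```

In this model the requested indent size only matters on the non-indentless path, which is consistent with the dashes staying at the parent's column for any `indent` value.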
@zdog234 I tried the following changes myself, and neither made any difference:
def expect_block_sequence(self):
    indentless = (self.mapping_context and not self.indention)
    self.increase_indent(flow=True, indentless=indentless)
    self.state = self.expect_first_block_sequence_item
or
def expect_block_sequence(self):
    indentless = (self.mapping_context and not self.indention)
    self.increase_indent(flow=True)
    self.state = self.expect_first_block_sequence_item
@ingydotnet @perlpunk any chance you can shed some light on this? I may then be able to open a PR.
Well, I wouldn't call that behaviour incorrect. I guess it's a matter of taste, and I can find arguments proving that it's consistent. What I'm also missing in this issue is the expected correct behaviour.
Let's look at both examples:
--- # spaces = 4
list_of_dict_attr:
-   attr1: value1
    attr2: value2
    attr3:
    - item1
    - item2
single_attr: value1
--- # spaces = 2
list_of_dict_attr:
- attr1: value1
  attr2: value2
  attr3:
  - item1
  - item2
single_attr: value1
The top level mapping has an indentation of zero (0 * spaces).
The value for list_of_dict_attr, the sequence, also has an indentation of zero, because PyYAML chooses zero-indented sequences always. That's why the dashes have no indentation in both cases. If it chooses zero-indentation, it simply does not depend on the number of spaces you configured.
The value of the first sequence item, the mapping attr1: ..., has an indentation of 1 * spaces (respectively 4 or 2).
The sequence under attr3 is zero-indented again, so 1 * spaces. The items of this sequence are on the same line, so they don't get any indentation.
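The walkthrough above can be checked with a short script. This is a minimal sketch, assuming PyYAML is installed; the `data` dictionary is reconstructed from the output in this thread, and `dash_and_key_columns` is a hypothetical helper introduced only for illustration.

```python
import yaml

# Reconstructed from the YAML shown in this issue (an assumption).
data = {
    "list_of_dict_attr": [
        {"attr1": "value1", "attr2": "value2", "attr3": ["item1", "item2"]},
    ],
    "single_attr": "value1",
}


def dash_and_key_columns(indent):
    """Return (column of the first dash, column of the 'attr2' key)."""
    text = yaml.dump(data, indent=indent, default_flow_style=False)
    lines = text.splitlines()
    dash_line = next(line for line in lines if line.lstrip().startswith("-"))
    key_line = next(line for line in lines if "attr2" in line)
    dash_col = len(dash_line) - len(dash_line.lstrip())
    return dash_col, key_line.index("attr2")


for indent in (2, 4):
    # The dash column stays at 0; only the mapping keys move with `indent`.
    print(indent, dash_and_key_columns(indent))
```

The dash column does not change between the two runs, while the column of `attr2` follows the configured number of spaces.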
I assume you would expect this instead?
--- # spaces = 4
list_of_dict_attr:
    -   attr1: value1
        attr2: value2
        attr3:
            - item1
            - item2
single_attr: value1
I can't speak for @DanyC97 but in my opinion yes, that last snippet is what is expected when passing spaces=4.
From my point of view, the most widely-accepted indentation style for sequences is the one used multiple times in the official YAML specification. For instance, in section 2.1, example 2.3 looks like this:
american:
  - Boston Red Sox
  - Detroit Tigers
  - New York Yankees
national:
  - New York Mets
  - Chicago Cubs
  - Atlanta Braves
The question is whether tools like pyyaml should render sequences in such a way for indentation of size 4 or for indentation of size 2.
I would argue that it seems incorrect to render sequences in such a way for indentation of size 4, because other items would visually appear to be indented more:
mapping:
    one: 1
    two: 2
list:
  - 1
  - 2
Therefore, I think that it is more appropriate to render sequences in such a way for indentation of size 2:
mapping:
  one: 1
  two: 2
list:
  - 1
  - 2
That being said, someone may prefer to not indent sequence items to a level that is visually similar to the indentation level of the other items. That is a fair requirement, but in order to fully support it, there would have to be a separate configuration option for indentation size of sequences.
I agree with @pbasista about what the output should look like; that is also the behavior yamllint expects. A separate configuration option might be a solution that works both for people who want this style and for those who do not.
I'm currently dealing with a YAML file generated by pyyaml that yamllint rejects because of the list indentation.
"there would have to be a separate configuration option for indentation size of sequences."
From my point of view that would be best, because IMHO block sequences are currently simply not indented at all (at least when they are the value of a block mapping), independent of the indent option.
Is there any progress on this? Right now the workaround here is working for me https://stackoverflow.com/questions/25108581/python-yaml-dump-bad-indentation
It seems the spec itself varies in how it formats sequences. The Preview section shows sequences indented from the key.
Example 2.3. Mapping Scalars to Sequences (ball clubs in each league)
american:
  - Boston Red Sox
  - Detroit Tigers
  - New York Yankees
national:
  - New York Mets
  - Chicago Cubs
  - Atlanta Braves
However the Failsafe Schema is indeed what pyyaml is doing:
10.1. Failsafe Schema
The failsafe schema is guaranteed to work with any YAML document. It is therefore the recommended schema for generic YAML tools. A YAML processor should therefore support this schema, at least as an option.
...
10.1.1.2. Generic Sequence
URI: tag:yaml.org,2002:seq
Kind: Sequence.
Definition: Represents a collection indexed by sequential integers starting with zero. Example bindings to native types include Perl’s array, Python’s list or tuple, and Java’s array or Vector.
Example 10.2. !!seq Examples
Block style: !!seq
- Clark Evans
- Ingy döt Net
- Oren Ben-Kiki

Flow style: !!seq [ Clark Evans, Ingy döt Net, Oren Ben-Kiki ]
Personally I prefer the indented format, and it would be nice if pyyaml supported it as an option, but the code isn't doing anything wrong by omitting the indents, even if yamllint disagrees.
Just ran into this myself in some example config generation I am doing, and it makes some of my exports rather weird.
The indented list structure does seem more common, and it would be nice if pyyaml supported it.
Edit: I've now also temporarily solved it with https://stackoverflow.com/questions/25108581/python-yaml-dump-bad-indentation.
Also running into this via ansible. It would be nice if the indentation were consistent.
The workaround mentioned above:
import yaml

class Dumper(yaml.Dumper):
    def increase_indent(self, flow=False, *args, **kwargs):
        # Always pass indentless=False so block sequences are indented.
        return super().increase_indent(flow=flow, indentless=False)

print(yaml.dump(data, Dumper=Dumper))
Gateways:
  - 14
  - 4
  - 18
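To see how this workaround interacts with the `indent` option, here is a small self-contained sketch. It assumes PyYAML is installed; `IndentedDumper` is just an illustrative name for the subclass above, and the `data` value is inferred from the output shown in this comment.

```python
import yaml


class IndentedDumper(yaml.Dumper):
    """Pure-Python dumper that never emits 'indentless' block sequences."""

    def increase_indent(self, flow=False, indentless=False):
        # Ignore the emitter's request for an indentless sequence.
        return super().increase_indent(flow=flow, indentless=False)


# Inferred from the output above (an assumption).
data = {"Gateways": [14, 4, 18]}

out2 = yaml.dump(data, Dumper=IndentedDumper, default_flow_style=False)
out4 = yaml.dump(data, Dumper=IndentedDumper, default_flow_style=False, indent=4)
print(out2)
print(out4)
```

With this subclass the dashes move together with the `indent` value, so the sequence indentation is tied to the mapping indentation rather than configured separately.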
I just use prettier as a pre-commit hook and it takes care of making yaml look good.
The workaround mentioned above:
class Dumper(yaml.Dumper):
    def increase_indent(self, flow=False, *args, **kwargs):
        return super().increase_indent(flow=flow, indentless=False)

print(yaml.dump(data, Dumper=Dumper))
Unfortunately, it does not work with CDumper. When working with a lot of YAML, or with large YAML documents, I'd rather have no indentation than "good-looking" output from a much slower generator.
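The difference between the two dumpers can be probed with a sketch like the one below. It assumes PyYAML is installed and treats the libyaml bindings as optional; `PyIndented`, `CIndented`, and the `called` flag are hypothetical names used only for this illustration, and the expectation that the C emitter never calls the Python-level hook is an assumption based on the comment above.

```python
import yaml

data = {"Gateways": [14, 4, 18]}


class PyIndented(yaml.Dumper):
    def increase_indent(self, flow=False, indentless=False):
        return super().increase_indent(flow=flow, indentless=False)


py_out = yaml.dump(data, Dumper=PyIndented, default_flow_style=False)
print(py_out)

# yaml.CDumper is only present when PyYAML was built against libyaml.
c_out = None
CDumper = getattr(yaml, "CDumper", None)
if CDumper is not None:
    class CIndented(CDumper):
        called = False

        def increase_indent(self, flow=False, indentless=False):
            # Only records the call; the C emitter is not expected to use it.
            CIndented.called = True

    c_out = yaml.dump(data, Dumper=CIndented, default_flow_style=False)
    print(c_out)
    print("hook called:", CIndented.called)
```

If the hook is never called, the override has no way to influence the C emitter, which would explain why the workaround is limited to the slower pure-Python `Dumper`.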
"I just use prettier as a pre-commit hook and it takes care of making yaml look good."
I tried using prettier, but it made changes that both yamllint and I disagreed with.
Another year, 2022, is coming. Is there an easy way to resolve this issue?
"Another year, 2022, is coming. Is there an easy way to resolve this issue?"
https://github.com/yaml/pyyaml/issues/234#issuecomment-765894586
This worked for me. It is quite easy to implement also.
Still doesn't work. Will this be fixed?
My code:
def test_demo(self):
    data = {
        "name": "John",
        "age": 30,
        "city": "New York",
        "haha": ["aaaa", "bbbb"]
    }
    # Set the indent and default_flow_style parameters
    output = yaml.dump(data, Dumper=Dumper, sort_keys=False, indent=2)
    print(output)
The result is:
age: 30
city: New York
haha:
- aaaa
- bbbb
What I expected is:
age: 30
city: New York
haha:
  - aaaa
  - bbbb
Found a temp solution
https://stackoverflow.com/a/39681672/4037224
@Acidherr
This worked for me. It is quite easy to implement also.
What in the words indent=4 is so hard to understand?
No it doesn't work with indent=4, in 2023.
@pkit I know you're frustrated about the slow progress on this issue - many of us are - but please do not take your frustration out on your fellow commenters and contributors. By doing so you reduce trust and diminish the quality of all open source projects related to this one.