yapf icon indicating copy to clipboard operation
yapf copied to clipboard

indent_dictionary_value:True breaks lines even when values fit on the same line

Open IamJeffG opened this issue 6 years ago • 8 comments

The knob documentations says indent_dictionary_value:True will "indent the dictionary value if it cannot fit on the same line as the dictionary key". However I see cases where this option creates terrible formatting even if every line of my input is already less than column_limit.

Stranger still, this new (bad) splits are not even between dictionary key and value, as the knob would have suggested.

nested-list.py

_schema = [
    ('uuid', {'field_type': 'STRING'}),
    ('ingredient_line', {'field_type': 'STRING'}),
    ('item_name', {'field_type': 'STRING'}),
    ('query', {'field_type': 'STRING'}),
    ('item_name_seq', {'field_type': 'INTEGER', 'mode': 'REPEATED'}),
    ('query_seq', {'field_type': 'INTEGER', 'mode': 'REPEATED'}),
    ('relevance', {'field_type': 'FLOAT'}),
    ('fold', {'field_type': 'INTEGER'}),
    ('params', {'field_type': 'STRING'}),
    ('timestamp', {'field_type': 'DATETIME'})
]

To reproduce

yapf --style '{indent_dictionary_value:True}' --diff nested-list.py 
--- nested-list.py	(original)
+++ nested-list.py	(reformatted)
@@ -1,12 +1,23 @@
-_schema = [
-    ('uuid', {'field_type': 'STRING'}),
-    ('ingredient_line', {'field_type': 'STRING'}),
-    ('item_name', {'field_type': 'STRING'}),
-    ('query', {'field_type': 'STRING'}),
-    ('item_name_seq', {'field_type': 'INTEGER', 'mode': 'REPEATED'}),
-    ('query_seq', {'field_type': 'INTEGER', 'mode': 'REPEATED'}),
-    ('relevance', {'field_type': 'FLOAT'}),
-    ('fold', {'field_type': 'INTEGER'}),
-    ('params', {'field_type': 'STRING'}),
-    ('timestamp', {'field_type': 'DATETIME'})
-]
+_schema = [('uuid', {
+    'field_type': 'STRING'
+}), ('ingredient_line', {
+    'field_type': 'STRING'
+}), ('item_name', {
+    'field_type': 'STRING'
+}), ('query', {
+    'field_type': 'STRING'
+}), ('item_name_seq', {
+    'field_type': 'INTEGER',
+    'mode': 'REPEATED'
+}), ('query_seq', {
+    'field_type': 'INTEGER',
+    'mode': 'REPEATED'
+}), ('relevance', {
+    'field_type': 'FLOAT'
+}), ('fold', {
+    'field_type': 'INTEGER'
+}), ('params', {
+    'field_type': 'STRING'
+}), ('timestamp', {
+    'field_type': 'DATETIME'
+})]

My expectation:

  • [ ] indent_dictionary_value:True does not reformat lines that already fit on one line
  • [ ] indent_dictionary_value:True does not introduce new splits that aren't between dict key and value.

Note that if I keep the default of indent_dictionary_value:True, the yapf (correctly) does not reformat my code. (My motivation for forcing indent_dictionary_value=True is to workaround #392.)

IamJeffG avatar Mar 29 '18 18:03 IamJeffG

Checking this out now. Is there a default column_limit?

See now that it's 80.

jbrill avatar Apr 01 '18 22:04 jbrill

It's technically correct (the best kind of correct!). The value of each dictionary entry is on the same line as the key. What's happening is that the dictionary itself has each element on a separate line.

What's happening is that each element in the list should be on a single line. But YAPF is changing it because it likes to place dictionary elements on individual lines.

bwendling avatar Apr 10 '18 05:04 bwendling

YAPF is changing it because it likes to place dictionary elements on individual lines

If that's the case, then indent_dictionary_value should have nothing to do with it. Yet the value of this knob does change the formatting: for some reason YAPF no longer likes to place dictionary elements on different lines when indent_dictionary_value:False. IMO this is still a bug for that reason.

In any event, what causes YAPF to prefer each dictionary element on its own line? I'm looking for a workaround, so is there maybe some penalty that can adjust this preference? (besides indent_dictionary_value, that is.)

IamJeffG avatar Apr 11 '18 05:04 IamJeffG

@IamJeffG I've been playing around with a few different tests to see how YAPF changes the code when indent_dictionary_value is switched to true and false. As you most likely saw in #392, YAPF will place values on separate lines from their respective keys when there is at least one key/value pair that exceeds the column limit. I'm currently looking into how YAPF can avoid splitting each line for this case.

However, for your certain test case above, I believe the errors occurring here are due to the special characters. YAPF is, as a result, misinterpreting the key and value pairs from my understanding. Are there other test cases that you've found where "YAPF no longer likes to place dictionary elements on different lines when indent_dictionary_value:False"? If so, could you post them here please to help me better understand the issue?

mnsaab avatar Apr 13 '18 03:04 mnsaab

If you look at the section in MustSplit() that starts with EACH_DICT_ENTRY_ON_SEPARATE_LINE, you'll find where some of this logic is being calculated I think. As you can see, the logic there is pretty no-so-great and can possibly be modified to handle situations like the one above plus others. It might also be possible to improve the split penalty calculations, though that would be at another place in the code and not something I've tried...

bwendling avatar Apr 13 '18 05:04 bwendling

Are there other test cases that you've found where "YAPF no longer likes to place dictionary elements on different lines when indent_dictionary_value:False"?

Just to be clear, I'm not saying it's a bug to keep dictionary elements on the same line. I think the bug is that this behavior changes due to an unrelated knob.

Anyway, to directly answer your question, I never use indent_dictionary_value:False due to #392 so wouldn't have noticed many examples for you here. Here is a counterexample of your note above that "YAPF will place values on separate lines from their respective keys when there is at least one key/value pair that exceeds the column limit", but I think this observation is more poignant to #392 than to this issue:

$ yapf --style '{column_limit:80, indent_dictionary_value:False}'  foo.py 
factor = api.model(
    'Conversion factor', {
        'goat': fields.Float(description='Mauris imperdiet, neque eu maximus.'),
        'armadillo':
        fields.Float(description='Etiam consequat libero nec neque.'),
        'orangutan': fields.Float(description='Nam accumsan velit, id.'),
        'cheetah':
        fields.Float(description='Aenean tempor fermentum justo vitae.')
    })

(Notice that the 2nd and 4th key-values would exceed the column limit, yet the 1st and 3rd key-values are not split!)

More relevant to this issue, here's another example where indent_dictionary_value determines more than only whether the value is indented, but also how the nested literal structures are split (I assume this is what you meant by "special characters"):

$ yapf --style '{indent_dictionary_value:False}' bar.py 
PLAYERS = {
    'Golden State Warriers':
    ['thompson', 'looney', 'green', 'mccaw', 'jones', 'curry']
}

$ yapf --style '{indent_dictionary_value:True}' bar.py 
PLAYERS = {
    'Golden State Warriers': [
        'thompson', 'looney', 'green', 'mccaw', 'jones', 'curry'
    ]
}

(In this case, I would not expect the knob to affect whether to split after the opening square bracket)

IamJeffG avatar Apr 16 '18 03:04 IamJeffG

I just posted a comment in another issue which I think fully reflects this problem --> https://github.com/google/yapf/issues/623#issuecomment-871471510

All the lines fit within my specified line length, but Yapf keeps moving that opening curly brace to a new line.

LECbg avatar Jun 30 '21 15:06 LECbg

Worked around this by setting ALLOW_SPLIT_BEFORE_DICT_VALUE to False.

slw07g avatar Jun 23 '22 05:06 slw07g