black
black copied to clipboard
Add linebreak in comprehension with complex element expression
To me Black's heuristics are too eager to wrap comprehensions up into one line.
Do you think:
(Output from Black)
new_values = [
left if left < right else right for left, right in values if 0 <= right
]
is easier to skim-read than:
new_values = [
left
if left < right else
right
for left, right in values
if 0 <= right
]
I can change the code to have really long names to break Black's heuristics. And Black outputs a much easier to read format. The only problem is the names I've picked to get the nice readable format are way less readable. So now I've traded readability for readability.
new_values = [
a_super_long_name
if a_super_long_name < b_super_long_name else
b_super_long_name
for a_super_long_name, b_super_long_name in values
if 0 <= b_super_long_name
]
(Output from Black)
new_values = [
a_super_long_name
if a_super_long_name < b_super_long_name
else b_super_long_name
for a_super_long_name, b_super_long_name in values
if 0 <= b_super_long_name
]
How can I up the heuristics so much so I can get the following output from Black?
new_values = [
value
for value in values
]
I'd much rather always have the above, than have complex comprehensions go on one line.
Just use # fmt: off
and # fmt: on
to disable formatting for given code block. Fitting to the single line if possible imho definitely should remain a standard approach.
I tend to agree that if there's an else
statement in a comprehension, then exploding it usually always increases readability.
Maybe there's potential for a formatting rule there?
@jaklan Using # fmt: on/off
is not a reasonable solution for a couple of reasons.
-
I want black to format the inner parts of the comprehension.
Say someone uses
'
rather than"
; I'd want black to fix that. (I do at times, still converting to"
) However the binary nature of# fmt: on/off
will mean black won't fix issues with the code. Additionally we can see me and black have different opinions on how to format theelse
. When changing to black's style I will make mistakes inside the# fmt: off
which black should fix. -
Knowing which comprehensions to
# fmt: off
is non-obvious.I'd not have thought a 7 line comprehension would be wrapped up into one line. As such I'm going to over use
# fmt: off
and handicap black leading to ¶1. -
Converting to black is a needlessly large effort.
In some projects I have lots of comprehensions. To use black and keep my code readable I would have to manually go through the entire of my code and add
# fmt: off
everywhere. -
Converting my code so black only runs on 50% of the code is a waste.
If I'm disabling black to the point where 50% of my code isn't being fixed, then what's the point in using black at all? I'd have to spend so much time adding
# fmt: off
and I'd still have to run other tools and manually fix any errors in my comprehensions.
Upping black's heuristics would fix all of the issues with # fmt: off
. Which is why I asked:
How can I up the heuristics so much so I can get the following output from Black?
@Peilonrayz You simply can't. Black is about compromises driven by the will of the majority / maintainers, not about fitting needs of each individual. You have a different definition of readability than black? So just use yapf and configure it as you wish.
Please let the maintainers of black speak on behalf of black.
So please mention people who have your blessing to answer the questions if you don't want to get them from the community - no need to waste time then.
I agree with breaking apart long line list comprehensions would be a nice to have. But, I feel two main things are stopping us from considering this:
- I don’t think the code in order to do this sanely would be simple. I refer to the difficulty we’ve had with the
magic-trailing-comma
feature. - It would be very disruptive and probably upset a lot of black users (If we can somehow see if majority of users would welcome this, that would elevate this speculation)
I believe the logic today is all just based on line lengths and nothing more. I would welcome proof of concept PRs to address point 1 and see if the code is complex and how many edge cases we uncover. This does not guarantee merging tho.
I've not read black's code but I can think of two simpler ways to add support in black:
-
Multiply the character length black sees for the comprehension by a factor of n. With a factor of 5 a 10 character comprehension would be seen as 50 characters. 5 is just an example, I can't imagine many people would want more than 2.
-
Have a separate line length limit for comprehensions.
For example if we set the length limit to 20 characters the following would be fine:
my_super_long_name = [x for x in values]
I think special casing turneries like hukkinj1 suggested would probably be pretty hard to implement.
However a better method more inline with black's ethos may exist. @cooperlees do you have any opinions on which direction I should set out on an example PR?
To provide another opinion, I'd like to see this feature too and I mostly agree. But not all the way.
# This I like, and having "else" at the beginning of the line
# rather than at the end of the previous is better in my opinion
# (meaning I'd also like to see this with the shorter names)
new_values = [
a_super_long_name
if a_super_long_name < b_super_long_name
else b_super_long_name
for a_super_long_name, b_super_long_name in values
if 0 <= b_super_long_name
]
# This is too much, as it's very easy to read on one line
new_values = [
value
for value in values
]
So to me it's a question of complexity. Going beyond just comprehensions, this could be reformatted as well:
some_value = (
value
if cond
else another_value
)
but I don't think it should. At least when not speaking about the implementation of Black, I could see a way of formulating a complexity heuristic for an expression, which could be used to explode statements that don't exceed the line length limit. For example, "explode everything that has two or more nested statements":
[i for i in range(3)]
[
i
for i in range(3)
if i < 2 # Debatable whether this is part of "for" or not but anyways..
]
(a if c else b)
(
a
if c1
else b
if c2
else c
)
But I'm not in the know of Black internals, nor am I aware of all the complexities of these rules out in the wild. Thoughts?
In #2841, a set of criteria for this kind of splitting was proposed. I argued (and presented some examples) that they were too loose, meaning they split too many simple statements, but it could be a good starting point. They were:
- It has more than one attribute access in any single term of the overall expression
- It has more than one operator in any single term of the overall expression
- A single term in the expression contains both an attribute access and an operator
- The entire expression has a length exceeding 40 chars
- It has an
if
clause - It has more than 1
for
clause
Let's continue hashing this out!
Given the age on this, I have little hope it will be addressed.
List comprehensions are one of only 3 major issues I have with black. In most cases there isn't any good reason to modify anything, and doing so nearly always makes it worse.
It feels like comprehensions are treated fundamentally wrong 🤷 but I'm having a hard time describing it. They are their own "thing", but the formatter treats them as an expression to squash - but they are a loop and conditional and follow their own rule(s). There is no good reason to force them to a single line.
The first multi-line (as written) is readable, but the black squashed version is awful.
results = [
self.generate_result_item(f"B{x}", size="XL")
for x in range(100) if x %2 == 0
]
results = [self.generate_result_item(f"B{x}", size="XL") for x in range(100) if x % 2 == 0]
Adding the magic comma results in a slightly better version. That works but isn't what was desired. If it can avoid collapsing on one it should be fine on the other too. The expanded if
isn't desired, but at least its consistent.
results = [
self.generate_result_item(
f"B{x}",
size="XL",
)
for x in range(100)
if x %2 == 0
]
I hope not, but comprehensions might just be too complicated and varied to format well deterministically, just like SQL.
@felix-hilden Is there still interest in a fix for this issue?
I appreciate that it will be difficult to come up with a good set of heuristics for formatting comprehensions along the lines of those you list. I think this is an important issue for readability, so I'm happy to try to come up with better heuristics if there's a chance they might get merged.
Alternatively, I have the following thought.
Would it be possible to add a flag to black to stop black from removing line breaks from comprehensions? It seems the problem is always that black collapses lines in comprehensions that were separated for readability.
When such a flag is passed, black would still be permitted to add line breaks, to meet it's line length requirement and other requirements. To some degree, that flag would provide the best of both worlds, and I think it would make it much easier to use black for those of us with complex comprehensions in our code.
Thank you for the conversation! To add another aspect, black formatted a list comprehension in a surprising — and for me disagreeable — manner:
bla = [
{
"multiple": "entries",
"in": {"this: "dictionary"},
}
for a, b, c in some_object.do_stuff(
arg1, arg2, arg3
)
]
whereas both for
and in
would have fitted nicely on their own lines:
bla = [
{
"multiple": "entries",
"in": {"this: "dictionary"},
}
for a, b, c
in some_object.do_stuff(arg1, arg2, arg3)
]
For me the second version is quite a bit more readable.
@nsfinkelstein I like the sound of your suggestions - just don't edit the newlines in a comprehension outside of function calls embedded within it.
If the temptation to reformat can be overcome, I think this would solve all potential issues:
if expression is ListComprehension:
raise SkipReformatting()
@awbacker Thanks. Just to be clear, that's not exactly the suggestion. I'm suggesting that black still be permitted to add linebreaks in a comprehension (list or otherwise) to meet its other requirements, but that it no longer be permitted to remove them.
@noam-delfina I see. A little disagreement then, I don't think black should format them at all other than spacing. If you find yourself in a hole, first rule is to stop digging!