black
black copied to clipboard
Fix internal magic commas not being force expanded
Description
Fix for issue #3268, #2052 As per the Black spec
The magic trailing comma
... However, there are cases where you put a short collection or function call in your code but you anticipate it will grow in the future. ... Now, you can communicate that you don’t want that by putting a trailing comma in the collection yourself. When you do, Black will know to always explode your collection into one item per line.
This means the current formatting output (the code below)
assert foo(1, 2, 3,)[
0
] == {'bar': 'baz'}
should instead be
assert foo(
1,
2,
3,
)[0] == {'bar': 'baz'}
Checklist - did you ...
- [ ] Add an entry in
CHANGES.md
if necessary? - [X] Add / update tests if necessary?
- [X] Add new / update outdated documentation?
Can someone verify that the test output is correct? I checked some simplified versions of the test and used that was my reference for what the output should be.
Does this overlap with #3370?
#3370 doesn't fix this issue, though some of the changes should make this easier. It probably has more overlap with #2052.
#2052 does seem to be the same problem, making #3268 a duplicate of it.
So I did some code coverage comparison and lines 207 and 208 in lines.py
seem to be behind the problem. https://github.com/psf/black/blob/8429ebc2eabdd3ec984505ccff265e692cb49f7d/src/black/lines.py#L195-L210
I ran two test minimal test cases with code coverage, shown below.
# failing
[1,][
2
](3)
# passing
[
1,
][2]()
The failing test did not run lines 207 and 208, while the passing test did. ~The issue seems to be that parentheses pairs that are empty do not cause the last leaf to be shifted back and for them to be marked as ignored.~ (mixed up the tests)
Some other test cases which more clearly show the issue are below.
# passing
[
1,
]()()()()
# passing regardless of which parentheses pair the 2 is in as long as there is only one
[
1,
](2)()()()
# failing if two or more of the entries are filled
[1,](
2
)(2)()()
Some other missed lines in the buggy test case are 116 and 216 in black/brackets.py
and 1162-1163 and 1227-1228 in black/linegen.py
. It's very interesting how the buggy test case has reduced coverage despite having a single extra character.
So a clear point where things go wrong is in black/brackets.py:116
. In the mark
function for the buggy test case, ,
never has a depth of 0, which means the comma is never marked as a delimiter with COMMA_PRIORITY
.
So the problem that I'm finding is that generate_trailers_to_omit
in linegen.py
is breaking early after having 2 non-empty pairs of brackets. This is why the test cases in https://github.com/psf/black/pull/3377#issuecomment-1311239582 fail.
# passing
[
1,
]()()()()
# passing regardless of which parentheses pair the 2 is in as long as there is only one
[
1,
](2)()()()
# failing if two or more of the entries are filled
[1,](
2
)(2)()()
I moved the yield
to the end of the loop, but this does cause a regression with trailing_comma_optional_parens2
. It also does not completely fix multiple internal magic commas. Some sample failing cases are below.
# input
[1,](3,)
# expected output
[
1,
](
3,
)
# actual output
[1,](
3,
)
So in general, in the case of several magic commas on the same line, unless there is not enough room on the line, only the rightmost one will be forcefully split. Is this intended behavior? The documentation suggests that Black will always expand the collection to fit, but this is clearly not the case.
diff-shades results comparing this PR (48042ee74472de912b66780f6d96844b6df48535) to main (27932494bcefac03497dd92dcf0c59a04c10d757). The full diff is available in the logs under the "Generate HTML diff report" step.
╭──────────────────────── Summary ─────────────────────────╮
│ 10 projects & 44 files changed / 385 changes [+203/-182] │
│ │
│ ... out of 2 333 569 lines, 10 748 files & 23 projects │
╰──────────────────────────────────────────────────────────╯
Differences found.
I looked at a few of the diff-shades changes and I'm unfortunately not a fan; they mostly seem like gratuitous changes that don't make the formatting better. Can this fix be limited to only cases where a magic trailing comma is present?
I'm not sure what changes to make at this point. One problem has to do with sibling-level magic trailing commas. In right_hand_split
, with something like [1,][3,]
, it comes out with head = "[1,]["
, body="3,"
, and tail="]"
. I think is is going to require a more significant rewrite to handle that possibility, and I'm still not familiar enough with the library to be comfortable doing that.
Another problem is that the "fix" that I made changes how long lines are handled by always returning all trailers found by generate_trailers_to_omit
, which is why there are regressions like below. The function seems to currently assume that any trailing magic commas will be found very close to the end. The placement of the yield omit
that I used had the least amount of regressions in the unit tests, but that isn't good enough.
def _is_venv_activated(self) -> bool:
- return bool(environ.get("POETRY_ACTIVE")) or getattr(
- sys, "real_prefix", sys.prefix
- ) == str(self.env.path)
+ return bool(
+ environ.get("POETRY_ACTIVE")
+ ) or getattr(sys, "real_prefix", sys.prefix) == str(self.env.path)