ruff
Update line break heuristic when f-string has format specifier
Given:
aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa {
aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc + dddddddd:.3f} cccccccccc"
Ruff (with --preview) would format it as:
aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa {
aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc + dddddddd:.3f
} cccccccccc"
Here, we're changing the format specifier from .3f to .3f\n.
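To see why this matters, here is a minimal sketch that uses the builtin `format()`, which applies a spec string in the same way an f-string replacement field does. The helper name `spec_is_valid` is made up for illustration and is not part of Ruff.

```python
def spec_is_valid(value: float, spec: str) -> bool:
    """Return True if `spec` is accepted as a format spec for `value`."""
    try:
        format(value, spec)
    except ValueError:
        return False
    return True


value = 3.14159

# The spec as the user wrote it.
print(spec_is_valid(value, ".3f"))

# The same spec with a trailing newline, as it would be after reformatting.
print(spec_is_valid(value, ".3f\n"))
```

For a float, the second spec is rejected with a `ValueError`, so moving the closing brace onto its own line changes the behaviour of the program rather than just its layout.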
Now, if the user had the following code:
aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa {
aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc + dddddddd:.3f
} cccccccccc"
Then the new heuristic of not adding line breaks would mean that we'd format the above code as:
aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa {aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc + dddddddd:.3f
} cccccccccc"
The newline is already present in the format specifier .3f\n in the original source code.
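One way to check this is to parse both spellings with Python's `ast` module and compare the literal text of the format spec. This is only a sketch: `format_specs` is a hypothetical helper, the sample uses a triple-quoted f-string with a short expression `x` in place of the long one above, and it assumes a CPython version that accepts a newline inside the format spec of a triple-quoted f-string.

```python
import ast


def format_specs(source: str) -> list[str]:
    """Collect the literal format-spec text of every replacement field."""
    specs = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FormattedValue) and node.format_spec is not None:
            # The spec is itself a JoinedStr; join its constant parts.
            parts = node.format_spec.values
            specs.append(
                "".join(p.value for p in parts if isinstance(p, ast.Constant))
            )
    return specs


one_line = 'f"""{x:.3f}"""'
with_newline = 'f"""{x:.3f\n}"""'  # real newline between the spec and "}"

print(format_specs(one_line))
print(format_specs(with_newline))
```

If the two results differ only by the trailing newline, the newline is part of the spec in the source, and the formatter has to preserve it to keep the AST unchanged.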
Local or Global?
Another question is whether this heuristic should be applied globally or locally to each individual expression element. For example, in the following code snippet,
f"""fooooooooooooooooooo barrrrrrrrrrrrrrrrrrr {
xxxxxxxxxxxxxxx:.3f} aaaaaaaaaaaaaaaaa { xxxxxxx } bbbbbbbbbbbb {
xxxxxxxxxxx + xxxxxxxxxxx } end"""
We have an expression element that has a line break and doesn't contain a format specifier (the third field), which means that the f-string layout is multiline. Applied locally, the first element cannot have line breaks while the second and third can. In that case, we'd potentially format it as:
f"""fooooooooooooooooooo barrrrrrrrrrrrrrrrrrr {xxxxxxxxxxxxxxx:.3f} aaaaaaaaaaaaaaaaa {xxxxxxx} bbbbbbbbbbbb {
xxxxxxxxxxx + xxxxxxxxxxx
} end"""
If we apply the heuristic globally, then the formatter would collapse all of the expression elements.
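The difference between the two options can be sketched with a toy model. This is not Ruff's implementation (Ruff is written in Rust). The names `Element`, `may_break_local` and `may_break_global` are hypothetical, and the global rule here assumes that a single format specifier anywhere disables line breaks in every element.

```python
from dataclasses import dataclass


@dataclass
class Element:
    has_format_spec: bool
    has_line_break: bool  # line break present in the original source


def is_multiline(elements: list[Element]) -> bool:
    # Only a line break in an element without a format spec counts.
    return any(e.has_line_break and not e.has_format_spec for e in elements)


def may_break_local(elements: list[Element]) -> list[bool]:
    # Decide per element: only elements with a format spec are kept flat.
    multiline = is_multiline(elements)
    return [multiline and not e.has_format_spec for e in elements]


def may_break_global(elements: list[Element]) -> list[bool]:
    # Decide once: any format spec keeps every element flat (assumption).
    blocked = any(e.has_format_spec for e in elements)
    allowed = is_multiline(elements) and not blocked
    return [allowed for _ in elements]


# The three fields from the snippet above, in order.
fields = [
    Element(has_format_spec=True, has_line_break=True),
    Element(has_format_spec=False, has_line_break=False),
    Element(has_format_spec=False, has_line_break=True),
]

print(may_break_local(fields))   # [False, True, True]
print(may_break_global(fields))  # [False, False, False]
```

Under the local rule only the first field is kept on one line, which matches the formatted output shown above. Under the global rule all three fields are collapsed.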
Playground: https://play.ruff.rs/c2d0cc39-30ab-4c17-ac8c-0f4d8a9afbb6
Thanks for opening this as an issue. Is this something you plan to work on?
Yeah, although not right away. It should be a simple fix.
I think with https://github.com/astral-sh/ruff/pull/7787, this issue isn't relevant anymore because both the unformatted and formatted code produce the same AST, since the lexer would stop at the newline token.
Right, but only for single-quoted strings. It's still relevant for triple-quoted strings.