Use `ruff-format`
ruff-format is more thorough in the formatting. This would change 109 files, so I am not committing them yet, ~~want to ask about autopep8 first.~~
Configuration settings: https://docs.astral.sh/ruff/settings/#format
Pretty sure @happz would very much not like this as ruff formatter is made to be drop-in replacement for black, not autopep8.
Ok, I've reverted the pep8 part
Pretty sure @happz would very much not like this as ruff formatter is made to be drop-in replacement for black, not autopep8.
Indeed, I for one was never fond of Black and opinions forming its formatting style. Mostly because of the decisions that led to allow very to little customization, leaving no space for project-specific formatting choices. I find that impractical, and one bad way of doing things is forcing everyone to do it in the same way.
ruff-format seems to offer some controls, and I briefly scanned the pre-commit report - it does not look as bad as I expected and remembered from my last experience with Black. I would like to read it properly to find out something I could complain about with having some hard examples :)
Two things I can share already:
- I would advocate using the
# fmt: offpragma, as there are pieces of code - like step data fields or test parameters - where the formatter undoes careful manual formatting that aimed at improving the readability of less than trivial structures. - I for one would 100 % recommend not reformatting the whole code base at once, in a single patch. It's extremely tedious to review it, and it's easy to get lost among literally thousands of lines. At least for me, that is. Should powers that be decide "yep, we want this", I'm sure we could come up with some per-package or per-module approach that would be easier to digest, and attract attention to - not a single aspiring "junior" developer would dive into reviewing a patch of thousands of lines, but they would be happily learning about tmt code reviewing changes to a plugin or two.
Just my two cents..
not reformatting the whole code base at once
format.quote-style = "preserve" would drastically lower the number of lines and files to review.
Putting aside the codestyle/formatting preferences, using ruff formatter would have some benefits, like speed, integration with linter rules, incl. config and ... speed. I mean, not just considerably faster to check/format, but also by "reusing" the same ruff pre-commit virtualenv as the lint step.
Added to our meeting agenda, let's get some input.
Could we tweak the option to remove space around the docstring?
- """ Add a string to the paragraph being rendered """
+ """Add a string to the paragraph being rendered"""
Would be nice to keep this formatting as it is now.
Could we tweak the option to remove space around the docstring?
- """ Add a string to the paragraph being rendered """ + """Add a string to the paragraph being rendered"""Would be nice to keep this formatting as it is now.
hmm, I scoured the internet, but haven't found the surrounding spaces in any standard or convention. We have https://docs.astral.sh/ruff/rules/surrounding-whitespace/ disabled - not sure if the formatter would ever respect that, aiming to be compatible with black.
I've rebased and updated the config a little bit, for people to see what changes there would be. It's mostly the surrounding spaces and trailing braces :)
https://github.com/teemtee/tmt/compare/main...martinhoyer:tmt:ruff-format
It kinda sounds like a bug in ruff upstream, does still occur with your updated version? Also feel free to force push the update, or I could rebase and cherry-pick?
feel free to force push
:saluting_face:
sounds like a bug in ruff upstream
I'm not sure. There would have to be a setting for having "allowing/preferring" this style. We are only ignoring that pydocstyle lint check.
Briefly went through the changes. While there are a lot of theme, I think they all make sense... Bumped into this though, looking at PEP8, it doesn't seem to be correct?
- unique_suffix = name[len(INJECT_CREDENTIALS_URL_PREFIX):]
+ unique_suffix = name[len(INJECT_CREDENTIALS_URL_PREFIX) :]
We got a reply from upstream: https://github.com/astral-sh/ruff/issues/13760#issuecomment-2413874694
@psss what do you think about this limitation? I.e.
there's no option to change that behavior because we intentionally limit the formatter options and only support formatting styles with widespread adoption.
@psss what do you think about this limitation? I.e.
there's no option to change that behavior because we intentionally limit the formatter options and only support formatting styles with widespread adoption.
I see. Hm, pity. I find the formatting with spaces a bit more readable. Feels like giving some freedom to the message helps to more quickly catch the words rather then when the first/last word is squeezed into the triple quotes. But I'm not going to fight for this. Others have any thoughts? Maybe I'm just the only one, so no need to block on this. Does ruff-format support one line docstrings like this?
def wake(self) -> None:
"""
Wake up the guest
"""
self.debug(f"Waking up container '{self.container}'.", level=2, shift=0)
We could have a single rule: Both starting and ending triple-quotes are on a separate line. That's it. I see this approach is used quite widely in the artemis code. Perhaps @happz will have some input on this?
It's mostly the surrounding spaces and trailing braces :)
The trailing braces part will be hard. Does ruff-format support hang-closing feature of autopep8? I really cannot understand at all how closing brackets and parenthesis can be placed out of the indented block???!!! :grin: That seems to be going against the main Python principle: Anything which belongs "under" the current line is indented. Want to find where the block ends? Look at the first line with the same or lower indent. That's it. So simple! :nerd_face: If this rule is broken I'm very much afraid the end of the world is near! 😵
One more thing: Among the changes I see also these:
dynamic_ref = resolve_dynamic_ref(
- workdir=destination,
- ref=plan_id.ref,
- plan=self,
- logger=self._logger
- )
+ workdir=destination, ref=plan_id.ref, plan=self, logger=self._logger
+)
I would say, in case cases, it's really more suitable to write each parameter on a single line. Does ruff-format always enforce single line when it's short enough or is this somehow configurable?
One more thing: Among the changes I see also these:
dynamic_ref = resolve_dynamic_ref( - workdir=destination, - ref=plan_id.ref, - plan=self, - logger=self._logger - ) + workdir=destination, ref=plan_id.ref, plan=self, logger=self._logger +)I would say, in case cases, it's really more suitable to write each parameter on a single line. Does
ruff-formatalways enforce single line when it's short enough or is this somehow configurable?
Agreed, at some point of N arguments, it's easier for me to follow them if they are each on its own line, even when they would fit on one line nicely. Maybe there is a control for this, like 4 or more arguments? Put them on new lines!
@psss what do you think about this limitation? I.e.
there's no option to change that behavior because we intentionally limit the formatter options and only support formatting styles with widespread adoption.
I see. Hm, pity. I find the formatting with spaces a bit more readable. Feels like giving some freedom to the message helps to more quickly catch the words rather then when the first/last word is squeezed into the triple quotes. But I'm not going to fight for this. Others have any thoughts? Maybe I'm just the only one, so no need to block on this. Does
ruff-formatsupport one line docstrings like this?def wake(self) -> None: """ Wake up the guest """ self.debug(f"Waking up container '{self.container}'.", level=2, shift=0)We could have a single rule: Both starting and ending triple-quotes are on a separate line. That's it. I see this approach is used quite widely in the
artemiscode. Perhaps @happz will have some input on this?
Yep, we're using these rules: 1. no docstring on the same line as """, i.e. every docstring has at least three lines; 2. there's a blank line after a docstring, it makes the code airier. No special handling for docstrings that fit on a single line, that's one exception gone.
It's mostly the surrounding spaces and trailing braces :)
The trailing braces part will be hard. Does
ruff-formatsupporthang-closingfeature ofautopep8? I really cannot understand at all how closing brackets and parenthesis can be placed out of the indented block???!!! 😁 That seems to be going against the main Python principle: Anything which belongs "under" the current line is indented. Want to find where the block ends? Look at the first line with the same or lower indent. That's it. So simple! 🤓 If this rule is broken I'm very much afraid the end of the world is near! 😵
First time hearing about this principle :)
Anyway, I'm not going to try to convince you it's better with the closing brackets aligned with the beginning of the statement - it is, but who am I to argue that :) I just wanted to share how I see it, why I feel it's a sane approach, internally consistent, and why I feel like I'm not breaking this main Python principle.
Want to find where the block ends? Look at the first line with the same or lower indent.
True, but a block like while, for, or if are not the same as a dictionary/list creation or a function call. The first ones are blocks, the latter are expressions. {...} can be given a name, while cannot. Blocks also don't have closing brackets or parenthesis, so, there is nothing to argue there, no closing bracket to indent.
Anything which belongs "under" the current line is indented
foo = [
{
'bar': 'baz'
}
]
Neither ] or } belong "under" the then-current line: ] is not part of the list opened by foo = [, just as } is not part of the dictionary opened by {. They are not the content, just like [ and {. The content alone is indented, and since [ and ] are not list items, { and } are not key/value pairs, they share the indentation level, and only the content is marked by the indentation as something "belonging" to an expression.
So, to paraphrase your question, want to find where the expression ends? On the same column where it started. Simple as that. It cannot end with white space because, unlike blocks, it must have a closing character.
As I see it, it doesn't conflict with blocks, it nicely delimits the content of an expression which is the important part, and with a trailing comma & closing bracket not being part of the final "item" it also leads to simpler patches: foo,\n) => foo,\nbar\n) is just one line added, foo) -> foo,\nbar) changes two lines). Yes, it needs one extra line, but that's fine to me because I use space in code; the dense, compact line is harder to follow, and when adding new items to lists and dictionaries, I do not need to worry about the trailing bracket, it's not affected.
If this rule is broken I'm very much afraid the end of the world is near! 😵
What I always liked is this paragraph from PEP8: https://peps.python.org/pep-0008/#a-foolish-consistency-is-the-hobgoblin-of-little-minds
The guidelines provided here are intended to improve the readability of code and make it consistent across the wide spectrum of Python code. As PEP 20 says, “Readability counts”.
The use of white space, blank lines, and indentation of closing brackets fits, in my books, into this category. They improve the readability. For me. Not following PEP8 or even the "main Python principle" when it makes code easier to read is not the end of the world ;)
Finally in Berlin, I'll start addressing the open issues
@martinhoyer did I accidentally uncommit your changes?
Does ruff-format support one line docstrings like this?
Yes, tested locally, I will add a commit to patch all of these.
Edit: added the commit for this PR to see the pre-commit diffs, it would be cherry-picked out later.
The trailing braces part will be hard. Does
ruff-formatsupporthang-closingfeature ofautopep8?
I don't see that option, and I think it's because of their choice to be an opinionated formatting tool. My thoughts on this is that it is more compatible with the default (non-ruff-format) PyCharm auto-formatting. One issue though is that pyproject.toml formatting would be different
Agreed, at some point of N arguments, it's easier for me to follow them if they are each on its own line, even when they would fit on one line nicely. Maybe there is a control for this, like 4 or more arguments? Put them on new lines!
The control is by the line-length, and if I read it correctly, you can force the splitting by adding the trailing comma. Are we ok with this approach, or should we make it split all of them?
Finally in Berlin, I'll start addressing the open issues
Wohoo!
@martinhoyer did I accidentally uncommit your changes?
Maybe, fwiw, I haven't touched my fork branch https://github.com/martinhoyer/tmt/commits/ruff-format/
I don't see that option, and I think it's because of their choice to be an opinionated formatting tool. My thoughts on this is that it is more compatible with the default (non-ruff-format) PyCharm auto-formatting.
+1
@martinhoyer can you make a PR to bump the pre-commit. I think there would be some changes from the ruff linter
@martinhoyer hi, I am the release lead, can we move this to 1.43, I heard you had an accident :(
Sorry for the slow response here, there was always something which seemed more important than spending too much time on the formatting discussions. But let's get this finally off the table.
Anyway, I'm not going to try to convince you it's better with the closing brackets aligned with the beginning of the statement - it is, but who am I to argue that :) I just wanted to share how I see it, why I feel it's a sane approach, internally consistent, and why I feel like I'm not breaking this main Python principle.
@happz, thanks much for sharing your detailed thoughts on this. It helped me to understand your point of view. The idea around indenting just the content (where brackets and parenthesis do not belong) makes sense and helps me to accept this approach although it does not please my eye.
As I see it, it doesn't conflict with blocks, it nicely delimits the content of an expression which is the important part, and with a trailing comma & closing bracket not being part of the final "item" it also leads to simpler patches:
foo,\n)=>foo,\nbar\n)is just one line added,foo)->foo,\nbar)changes two lines).
Yeah, completely agree that trailing commas and closing bracket on a separate line make sense and make review of changes easier. That benefit applies for both recommended indentations. And if can choose, I'd still prefer the hanging indent.
my_list = [
1, 2, 3,
4, 5, 6,
]
Anyway, it seems that ruff-format is very stubborn about this and if we want to leverage its other benefits there seems to be no other choice. Unless there would be a similar tool which would be less opinionated.
Let's update to latest ruff and go for it? Wdyt @psss? Would you be able to live with the style changes? :)
All in all, I don't want to block this. I can see it could save us from unnecessary future format discussions and could make our life easier by focusing on the coding itself. As it will affect thousands of lines, I just want to make sure that all regular contributors are ok with the change because it will be them spending hours and hours reading the code. See my next comment.
Here's an example demonstrating the proposed formatting to be enforced by ruff:
def tests(
self,
keys: Optional[list[str]] = None,
names: Optional[list[str]] = None,
filters: Optional[list[str]] = None,
conditions: Optional[list[str]] = None,
) -> list[Test]:
"""
One-line summary describing the function
Detailed description with the list of supported parameters and
optionally the return value if it is not obvious from the type
annotation. Paragraphs wrapped to 72 characters.
:param keys: parameter description
:param names: parameter description
:param conditions: parameter description
:returns: list of tests whose names match provided...
"""
# The closing brace/bracket/parenthesis on multiline constructs
# are lined up under the first character of the line that starts
# the multiline construct. (Long comments wrapped to 72 chars.)
my_list = [
1, 2, 3,
4, 5, 6,
]
result = some_function_that_takes_arguments(
'a', 'b', 'c',
'd', 'e', 'f',
)
def fun(number: int):
"""
One line summary is always on a separate line
"""
# Always one empty line after the docstring
if number > 0:
return number * 2
# Include a trailing comma in the list of function parameters to
# prevent ruff folding them into a single line
return other_fun(
first = 1,
second = 2,
third = 3,
)
Please, give a thumbs up if you agree with this formatting approach.
Good, seems we have a nice consensus on this. Let's cover the docstring changes before merging this one in a separate pull request:
- https://github.com/teemtee/tmt/pull/3509
In addition to the single-line mass update, present here as 39a9f2f8ca22e4f712e65569a7ce8cd68e19d580, I've included there a bunch of other manual adjustments to match the agreed format. Should be fairly easy to review.
I've quickly reviewed changes performed by ruff-format and overall I'd say they are reasonable. There are some cases though, where the original multi-line formatting is compressed into a single line which brings inconsistencies. Here's an example:
Before:
_DECIDE_COLORIZATION_TESTCASES = [
# With TTY simulated
DecideColorizationTestcase(
'tty, autodetection',
(True, True),
simulate_tty=True),
DecideColorizationTestcase(
'tty, disable with option',
(False, False),
set_no_color_option=True,
simulate_tty=True),
DecideColorizationTestcase(
'tty, disable with NO_COLOR',
(False, False),
set_no_color_envvar=True,
simulate_tty=True),
DecideColorizationTestcase(
'tty, disable with TMT_NO_COLOR',
(False, False),
set_tmt_no_color_envvar=True,
simulate_tty=True),
DecideColorizationTestcase(
'tty, force with option',
(True, True),
set_force_color_option=True,
simulate_tty=True),
After:
_DECIDE_COLORIZATION_TESTCASES = [
# With TTY simulated
DecideColorizationTestcase('tty, autodetection', (True, True), simulate_tty=True),
DecideColorizationTestcase(
'tty, disable with option', (False, False), set_no_color_option=True, simulate_tty=True
),
DecideColorizationTestcase(
'tty, disable with NO_COLOR', (False, False), set_no_color_envvar=True, simulate_tty=True
),
DecideColorizationTestcase(
'tty, disable with TMT_NO_COLOR',
(False, False),
set_tmt_no_color_envvar=True,
simulate_tty=True,
),
DecideColorizationTestcase(
'tty, force with option', (True, True), set_force_color_option=True, simulate_tty=True
),
I find this mixed one-line/multi-line format quite confusing and to me it looks like a substantial regression in readability. It's much harder to see the actual differences between individual test cases.
Seems we have two options here:
- keep it as it's reformatted by ruff and live with it
- convince ruff to keep multi-line by adding trailing comma
Here's how it would look like with the trailing commas added:
_DECIDE_COLORIZATION_TESTCASES = [
# With TTY simulated
DecideColorizationTestcase(
'tty, autodetection',
(True, True),
simulate_tty=True,
),
DecideColorizationTestcase(
'tty, disable with option',
(False, False),
set_no_color_option=True,
simulate_tty=True,
),
DecideColorizationTestcase(
'tty, disable with NO_COLOR',
(False, False),
set_no_color_envvar=True,
simulate_tty=True,
),
DecideColorizationTestcase(
'tty, disable with TMT_NO_COLOR',
(False, False),
set_tmt_no_color_envvar=True,
simulate_tty=True,
),
DecideColorizationTestcase(
'tty, force with option',
(True, True),
set_force_color_option=True,
simulate_tty=True,
),
What are your thoughts on this?
Doesn't it force it to multi-line if you manually add a comma at the end. We checked on this option somewhere in the history and there is an option for it.
Edit: Yeah it's skip-magic-trailing-comma which we did not manually set. The issue is that it was missing the trailing comma because ) was closing the last keyword
DecideColorizationTestcase(
'tty, autodetection',
(True, True),
- simulate_tty=True),
+ simulate_tty=True,),
(Oh, morning brain cannot read, that was already discussed)
I've quickly reviewed changes performed by
ruff-formatand overall I'd say they are reasonable. There are some cases though, where the original multi-line formatting is compressed into a single line which brings inconsistencies. Here's an example:...
Seems we have two options here:
- keep it as it's reformatted by ruff and live with it
- convince ruff to keep multi-line by adding trailing comma
What are your thoughts on this?
There is also secret option number 3, dropping the idea of ruff-format completely because the number of inconsistencies would be high enough to counteract the goal of releasing us from the burden of discussing formatting.
I agree the example above is indeed less than perfect, I for one would prefer all entries follow the same formatting. But, 1. I would survive, and 2. if it would be one of the few such "butchered" lists, I would be fine adding trailing commas to improve the formatting of said list, on the assumption it's an isolated, self-contained piece of code. But even that is a borderline for me.
I believe the question is, how many other places are there in code where ruff-format decisions would lead to formatting deemed as "regression in readability"? A couple of places, each preferably fitting on a single screen so he who edits can see why the special formatting hint is needed? I myself tailored lists of parameters for test cases so they would be easier to follow. But is it more than "a few"? Would we have to add trailing comma hints to methods, lists, dictionaries, or decorators of all kinds and colors? If so, then I'd say we have a problem, and adding ruff-format would be counterproductive, because we would have a tool to control formatting and we would also follow a set of rules fighting the tool in many (?) cases.
I don't like all opinions of "opinionated" ruff-format, I find some of them questionable. But I could live with them because I see benefits in not discussing indentation during reviews, and none of the said opinions were anywhere near a hill I would die on. I don't want touse the tool while still considering formatting and fighting the tool to satisfy the custom rules.
Well, most of the issues here is that a strict formatter was not enforced and this would deal with the downfall of that. I don't think there is a good way of dealing with such a PR other than going through each change gradually accumulating accepted changes and flagging the formatting changes to be discussed as @psss did.
I believe the question is, how many other places are there in code where ruff-format decisions would lead to formatting deemed as "regression in readability"?
I don't think we will be able to find out without gradually fixing the pre-commit. I was considering how to regex find for the case of missing trailing comma, but I couldn't find a good regex for that because of tuples like (True, False) :slightly_frowning_face:
I don't like all opinions of "opinionated"
ruff-format, I find some of them questionable.
Same, but it seems that you cannot make a consistent formatter without being opinionated.
How about this plan:
- We make a branch
ruff-formatof the currentmainjust to have a fixed place to discuss on - We cherry-pick the pre-commit commit onto branch
apply-ruff-formatand make a PR targetingruff-format - Either on
apply-ruff-formator individual PRs we go through the changes and pick out the obvious ones that we want applied, or systematic ones like adding trailing comma - When all changes are in place to pass pre-commit, we cherry-pick the systematic changes, and rebase
apply-ruff-formatto targetmainand let pre-commit re-format the remaining changes
- keep it as it's reformatted by ruff and live with
- convince ruff to keep multi-line by adding trailing comma
Vote for 2 from me. It's not far from what the original format was already at.
There is also secret option number 3, dropping the idea of
ruff-formatcompletely because the number of inconsistencies would be high enough to counteract the goal of releasing us from the burden of discussing formatting.
After reviewing in 5280db512b1eb99074eb35ba58272748b57a8f2b about 30 random files in from different parts of the code, my overall impression is that the final output is ok and that it's worth it. Just for those places where multiline is more readable, the trailing comma fixes that. But yeah, there are quite many such places.
I believe the question is, how many other places are there in code where
ruff-formatdecisions would lead to formatting deemed as "regression in readability"? A couple of places, each preferably fitting on a single screen so he who edits can see why the special formatting hint is needed? I myself tailored lists of parameters for test cases so they would be easier to follow. But is it more than "a few"? Would we have to add trailing comma hints to methods, lists, dictionaries, or decorators of all kinds and colors? If so, then I'd say we have a problem, and addingruff-formatwould be counterproductive, because we would have a tool to control formatting and we would also follow a set of rules fighting the tool in many (?) cases.
It's an additional work, yes, and it's not just a few. On the other hand it's just a one time task. When writing new code we can keep this on mind, I don't see a problem with including an extra comma, which I'm doing anyway (now that I learned that it's possible for function calls as well). Thanks to this, adding new arguments will be easier.
I don't like all opinions of "opinionated"
ruff-format, I find some of them questionable.
Yeah, similar with me. But I could live with them because I see benefits in not discussing indentation during reviews, and none of the said opinions were anywhere near a hill I would die on.
I don't want touse the tool while still considering formatting and fighting the tool to satisfy the custom rules.
From my brief experience, by adding trailing commas I was able to get output which was reasonable to my eye. Didn't feel like fighting with it too much. The rest of the code sounds fine to me. @happz, do you find the changes in 5280db512b1eb99074eb35ba58272748b57a8f2b reasonable? Or do you see even there (too much) stuff to fight with?
It's an additional work, yes, and it's not just a few. On the other hand it's just a one time task. When writing new code we can keep this on mind...
^^ This makes it the opposite of a one-time thing.
I don't see a problem with including an extra comma, which I'm doing anyway (now that I learned that it's possible for function calls as well). Thanks to this, adding new arguments will be easier.
Isn't that effectively disabling ruff-format capability of reformatting any non-trivial structure? Adding a trailing comma tells ruff-format to keep multiline expression as it is, even when it's - by mistake - not matching its surroundings.
FOO = [
(1, 2, 3)
(5, 6, 7)
(
7,
8,
9,
)
]
ruff-format will not touch the last tuple even when it clearly conflicts with the rest of the list, because I managed to add a trailing comma. Someone will have to notice it in the review and ask me to remove it, or we will get inconsistent formatting. This is the issue I hate to see sneaking in, for the sake of consistency we keep humans involved and invite another kind of inconsistency in.
From my brief experience, by adding trailing commas I was able to get output which was reasonable to my eye. Didn't feel like fighting with it too much. The rest of the code sounds fine to me. @happz, do you find the changes in 5280db5 reasonable? Or do you see even there (too much) stuff to fight with?
I'm not really sure how can I judge that. What I see is a reformatted source that looks acceptable. I can't tell though how much of it is ruff-format and how much is you preventing ruff-format from reformatting the code by adding a trailing comma to expressions ruff-format would otherwise modify.
I'm clearly in opposition to the proposal, and I am out of arguments, and honestly, out of breath as well. Please, proceed in a way you believe is the best, and I will move on to another issues.
I'm clearly in opposition to the proposal
Wait, @happz is against ruff format in general?
I'm clearly in opposition to the proposal
Wait, @happz is against ruff format in general?
No, just against keeping humans responsible for formatting by using trailing commas in a significant amount of cases. "proposed approach" would be a better term, I admit that.