Apply more ruff rules
Enforce these rules:
- flake8-pie (PIE)
- flake8-pyi (PYI)
- flake8-slots (SLOT)
- flake8-type-checking (TCH)
- pygrep-hooks (PGH)
- Ruff-specific rules (RUF)
Also apply assorted rules from these rulesets:
- flake8-comprehensions (C4)
- flake8-executable (EXE)
- flake8-raise (RSE)
- flake8-return (RET)
- flake8-simplify (SIM)
I find these rules improve consistency and readability. I decided to enforce the rulesets I am familiar with from other projects, and merely apply isolated rules when I fear a ruleset might generate too many false positives.
Can you add some details to your PR description which groups you're adding and why.
Can you add some details to your PR description which groups you're adding and why.
Agreed. This is turning into a pretty huge PR, all of which is essentially just cosmetic changes. I don't want to get into a big debate over "what rules should we enable" but equally I don't think we should simply enable rules because they are there. Apart from anything else, this will likely cause a lot of merge conflicts with the various PRs we currently have outstanding - and it's difficult enough to find time to progress those without dealing with the admin of what is essentially changes in project policy.
To be frank, I'm currently -1 on this change, unless the rules being enforced have some visible benefit to this project.
Most linter rules don't dramatically improve individual pieces of code as such. Yet, you have already enabled some rules. Why?
Like the rules you have already enabled, these new rules marginally improve some pieces of code. More importantly, they globally improve consistency and readability.
Note that recent commits such as 26c6a458bb4dc99f670230b467d22c84ac11c264 would have been caught by rules such as SIM118.
How about splitting this request into multiple requests?
- Apply the rules, but do not enforce them.
- Enforce the rules later on, gradually.
Alternatively, I can split into a PR for each ruleset.
The PR is made of small commits, each one applying a rule. The explanation for each rule is available in the Ruff Rules documentation. Copying the rationale for each rule here doesn't seem useful.
Hello!
I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the master branch into this pull request or rebase this pull request against master then it will eligible for code review and hopefully merging!
The explanation for each rule is available in the Ruff Rules documentation. Copying the rationale for each rule here doesn't seem useful.
Not all explanations in the Ruff documentation are particularly convincing though, and in some cases they are subjective and contrary to the code style pip has historically followed.
Here is a recent addition to the ruff ruleset https://github.com/pypa/pip/pull/12894 in pip's codebase, where each rule that creates a change to the codebase is justified as worth enabling or not, and if configuration is available why certain configuration choices were made or not made.
The problem is, I have no way of reviewing this PR (other than blindly saying "let's go for it"). And while I'm not suggesting this is the case, it would be so easy to sneak in something malicious in a PR of this size, where the changes are, let's be honest, mostly boring.
Mostly my understanding is that what we have is the default rule sets, or ones that are commonly acknowledged to be worthwhile. You're adding 60% more rules, with no real explanation beyond the fact that you find them helpful.
I really don't want to get into an extended debate here - the whole point of using a linter to enforce style choices is so that we don't have these sorts of debates, that ultimately degenerate into personal preference. And I'll freely admit that I'm not a fan of enforcement, I prefer coding style to be a choice, and a matter of taste. But it does seem like this might be a step too far.
I don't get it. While you're not a fan of enforcement, you already use a formatter and a linter, which don't leave much choice. I'm adding some rules that I find useful. Previous rules have not met that much opposition.
Again, as an example, commit 26c6a458bb4dc99f670230b467d22c84ac11c264 was accepted. It would have been enforced automatically by SIM118. What's wrong with SIM118 that's good with 26c6a458bb4dc99f670230b467d22c84ac11c264?
I understand one of the main concerns is that the PR looks huge. The number of commits might seem impressive, but if you have a look at the actual changes, they are limited. How about I split into multiple PRs, one per ruleset?
Again, as an example, commit 26c6a45 was accepted. It would have been enforced automatically by SIM118. What's wrong with SIM118 that's good with 26c6a45?
This seems like a really good justification to add that rule, it seems like an easy non-controversial PR, here’s a previous example of adding a single rule with a good justification: https://github.com/pypa/pip/pull/12393
I can try to split into multiple PR's. It would be easier to:
- review the changes,
- evaluate the usefulness of each ruleset.
Indeed, I have applied some SIM rules, but do not enforce them. My experience so far is that many of these SIM rules modify the existing code too much, without real benefits in terms of readability.
I don't get it.
I wasn't in favour of strict automatic enforcement when it was first proposed. But the other maintainers preferred it, so I accepted. Over time, I've been persuaded that it's a useful thing to have - but that's very much because we picked a set of rules to apply and stuck to them. What I don't like about this PR is that it changes our config to add a bunch of new rules, with no better justification than the fact that you think they are worth adding.
If someone proposed adding a new feature to pip with as little justification as this PR has, we'd reject it out of hand. While I appreciate that the bar is lower for changes that have no user-visible impact, I do think we should still expect some level of explanation as to why these changes are worth it.
I've stepped through and reviewed this entire PR
OK, I'm happy to accept your conclusions. My main point here was that the cost of reviewing this big of a change is significant - but if you've done that, then there's no further cost there. But my other point remains - are you OK with the impact this change will have on "in flight" PRs? Potentially a lot of merge conflicts, and possibly follow-up work as code in PRs won't have been checked against these new rules.
I've made my point here, though. I'm happy to defer to @pradyunsg on whether this should be merged.
I believe the justification for most of the rules is in the rules themselves, but I clearly need to explain better.
Let me repeat what I've written in the PR about previews. I see a difference between formatters and linters, even if the following attempt for definitions shows overlaps:
- The formatter applies a consistent style that's supposed to improve readability and reduce future churn. Style should not change much.
- In addition to style, the linter applies rules that are supposed to improve the code and avoid common pitfalls. Many
EandFrules can be seen as style issues, but theC4rules are more about performance. It might make sense to apply new rules when they avoid common pitfalls.
Because each ruleset is different, it might be easier to split into multiple PR, one for each ruleset. I have started with #12983 which disables debatable UP rules and applies a preview rule with minimal code changes. If you prefer such smaller PRs, with more explanations about the rationale behind the rules, I am happy to split this PR.
I see a difference between formatters and linters, even if the following attempt for definitions shows overlaps
I'm not sure what point you are making here. Our code style is (by project policy) whatever black implements. As black's style has remianed stable, this seems like a reasonable choice, and I don't think anyone is arguing about that.
However, lint rules are far more arguable, being very much more a "personal preference" matter. In addition, ruff has not been stable in terms of its default ruleset, which seems to be changing fairly often. We've made a decision to require that our selected rules are followed, and (IIRC) we use --fix to enforce that. But we have not, as far as I'm aware, made a policy decision to apply ruff's default ruleset without question - nor should we, IMO, given some of the discussions you linked to above. Therefore, we should require justification for changing the ruleset we apply. And that justification should be more than just personal preference - the whole point of using a formatter and linter is to stop style debates that are based around personal preference. Furthermore, pointing to the documentation of the rule isn't sufficient - that just says what the rule does, and doesn't explain why it's a good rule for pip.
Anyway, we're going round in circles here. You and I clearly disagree, and that's fine. I don't support the broad brush application of new ruff rules that you're proposing, and that's also fine. You won't persuade me to merge your proposals, but I've already said that I'm willing to defer to @pradyunsg who's done an actual detailed review of this PR.
I'll also note, for the record, that I don't necessarily disagree with the logic behind the rules, nor am I saying that the code changes here might not be worthwhile. But I'm saying that we should let the code author make those decisions, not defer them to a tool. That's the downside of mandatory[^1] lint rules - they take choice away from the developer. And I think we should be cautious about doing that.
[^1]: As opposed to a report that can be reviewed and judged on its merits. That was, after all, the original intention of linters - to provide information, not to enforce rules.
There has been recent discussion to remove ruff rule groups in favor of enabling individual rules are all well justified: https://github.com/pypa/pip/pull/13284
I'm going to make a PR soon that does this with the flake-bugbear (B) group and keep working from there.
As such I'm going to close this PR as it's unlikely to ever be accepted, though I do appreciate your reasoning and effort, and I think it would be valid and accepted for many projects.