PERF403 update suggestion will make code slower
Summary
Was looking at the PERF rules and noticed that one of the transformations PERF403 suggests makes code roughly 1.4x slower:
```python
import timeit

def main1():
    d = {}
    for x in range(8):
        for i, j in zip(range(x), range(x)):
            d[i] = j

def main2():
    d = {}
    for x in range(8):
        d.update({i: j for i, j in zip(range(x), range(x))})

if __name__ == "__main__":
    print(timeit.timeit(main1, number=10_000))
    print(timeit.timeit(main2, number=10_000))
```
```
λ python test.py
0.01902620808687061
0.026080582989379764
```
Version
No response
It's also a verbose fix, and at least in this case a much better fix would simply be:
```python
def main3():
    d = {}
    for x in range(8):
        d.update(zip(range(x), range(x)))
```
When run on my machine with Python 3.14, it beats the first two versions:
```
0.01532745803706348
0.019240333000198007
0.012530624982900918
```
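For context on why `main3` wins: `dict.update` accepts any iterable of key/value pairs, not just a mapping, so the temporary dict (and the comprehension's extra frame) can be skipped entirely. A minimal illustration:

```python
d = {}
# update() consumes any iterable of 2-item pairs directly;
# no intermediate dict is ever built.
d.update(zip(range(3), "abc"))
print(d)  # {0: 'a', 1: 'b', 2: 'c'}
```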
Yup! In this and https://github.com/astral-sh/ruff/issues/21891, the things that trigger my "this rule is probably counterproductive" sense are: 1) creating generators where none existed previously, and 2) instantiating temporary dicts/lists.
To me this and https://github.com/astral-sh/ruff/issues/21891 seem like pretty clear bugs. I don't think PERF autofixes or recommendations should ever make your code slower; if the motivation for a rule is more "stylistic consistency" rather than "micro-optimisations", then I don't think the rule belongs in the PERF category at all. Even if we decide we don't want to make any changes here to how the rule works, at the very least I think we should document that the rule sometimes makes suggestions that can lead to slowdowns. As a Ruff user, that's not at all what I'd expect a PERF rule to ever do.
A general issue with the PERF rules is that most of them seem dependent on CPython internals. It may very well be version specific whether these suggestions reliably give a speedup or not (e.g. https://docs.astral.sh/ruff/rules/try-except-in-loop/ where we end up skipping the rule in version 3.11+). Not sure the best way to handle this. Maybe we should be adding micro-benchmarks that get checked periodically on various Python versions? (seems a little unwieldy)
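For what it's worth, one concrete (if unwieldy) shape such a check could take is a tiny harness that times each candidate rewrite and reports the results alongside the interpreter version. This is just a sketch; all names here are made up for illustration:

```python
import sys
import timeit

def loop_version():
    # the original loop form PERF403 flags
    d = {}
    for x in range(8):
        for i, j in zip(range(x), range(x)):
            d[i] = j
    return d

def update_version():
    # the rewrite PERF403 currently suggests
    d = {}
    for x in range(8):
        d.update({i: j for i, j in zip(range(x), range(x))})
    return d

# hypothetical registry of variants to compare per Python version
VARIANTS = {"loop": loop_version, "update": update_version}

def bench(number=10_000):
    return {name: timeit.timeit(fn, number=number)
            for name, fn in VARIANTS.items()}

if __name__ == "__main__":
    print(sys.version_info[:2], bench())
```

Running something like this in CI across 3.9–3.14 would at least surface version-dependent regressions, though maintaining the variant list is exactly the unwieldy part.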
Yeah, that's why I didn't immediately consider these to be bugs. I found at least one case in #21891 where the extend version is actually faster. Assuming we don't want to maintain such a list of microbenchmarks, I would probably lean toward framing these rules as stylistic rather than performance-oriented, as Alex mentioned (and some day moving them out of a PERF or performance category entirely).
On the other hand, it also seems reasonable to me to drop the extend/update parts of the rules. Those intuitively feel like they would be slower and also don't feel like as much of a stylistic improvement to me as replacing a whole loop with a single comprehension.