pylint
pylint copied to clipboard
[R0801] Similarities false positive on abstract classes
Bug description
Having two classes with 5 methods that have a pass
implementation (assuming the default minimum 4 lines of similarity) results in a false positive, where methods with different names end up triggering the rule. This would be a false positive since the minimum number of matching lines seem to be getting applied to all pass
statements in the entire class, not on a per function basis.
In my specific use case I have two abstract classes which both have 5+ abstract methods to be overridden by the implementations. As a result, the similarities rule is getting triggered because there are 5 consecutive method definitions with implementations consisting of pass
.
Interestingly enough, replacing all usages of pass
with ...
fixes the error.
I played around with the code a little bit and found that in this if condition: https://github.com/PyCQA/pylint/blob/main/pylint/checkers/similar.py#L533
In my specific scenario, all lines of code between fst_file_start
and fst_file_end
or snd_file_start
and snd_file_end
all consisted of the pass
keyword and nothing else, with eff_cmn_nb
being 5, which triggered the rule.
The scenario is created with two files:
File one - test.py
class Processor:
def aaa(self) -> None:
pass
def baa(self) -> None:
pass
def caa(self) -> None:
pass
def daa(self) -> None:
pass
def gaa(self) -> None:
pass
File two - test2.py
class AnotherProcessor:
def haa(self) -> None:
pass
def zaa(self) -> None:
pass
def xaa(self) -> None:
pass
def vaa(self) -> None:
pass
def waa(self) -> None:
pass
Configuration
[tool.pylint."MESSAGES CONTROL"]
max-line-length = 120
disable = [
"C0114", # Missing module docstring
"C0115", # Missing class docstring
"C0116", # Missing function docstring
]
load-plugins = [
"pylint.extensions.no_self_use"
]
[tool.pylint.SIMILARITIES]
ignore-imports = "yes"
[tool.poetry]
name = "pylint-report"
version = "1.0.0"
description = ""
authors = ["A B <[email protected]>"]
[tool.poetry.dependencies]
python = "~3.9"
[tool.poetry.dev-dependencies]
pylint = "2.14.5"
[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
Command used
pylint test.py test2.py
Pylint output
************* Module test2
test2.py:1:0: R0801: Similar lines in 2 files
==test2:[3:16]
==test:[3:16]
pass
def baa(self) -> None:
pass
def caa(self) -> None:
pass
def daa(self) -> None:
pass
def gaa(self) -> None:
pass (duplicate-code)
------------------------------------------------------------------
Your code has been rated at 9.55/10 (previous run: 9.55/10, +0.00)
Expected behavior
The similarities (R0801) rule shouldn't be triggered when 2 abstract classes in different files each with (min-similarity-lines + 1) abstract methods with a "pass" implementation exist. The rule should only be triggered by the body of each function being a duplicate, not the aggregate of 5 functions with pass
implementations matching against another abstract class.
Pylint version
pylint 2.14.5
astroid 2.11.7
Python 3.9.10 (main, Jan 15 2022, 11:48:04)
[Clang 13.0.0 (clang-1300.0.29.3)]
OS / Environment
macOS Monterey Version 12.4
Additional dependencies
astroid==2.11.7 dill==0.3.5.1 isort==5.10.1 lazy-object-proxy==1.7.1 mccabe==0.7.0 platformdirs==2.5.2 pylint==2.14.5 tomli==2.0.1 tomlkit==0.11.1 typing-extensions==4.3.0 wrapt==1.14.1
I don't think abstract classes should be exempted from the duplicate code check. If function signatures are ignored I think the check is working as intended. (I'm going to wait for another opinion before closing).
I agree that abstract classes shouldn't be exempted from the duplicate code check, but that's not really my ask here.
The issue here is that while ignore-signatures
does in fact ignore function signatures, it doesn't take the structure of the code into account, which doesn't really make sense to me given the name of the configuration option.
In the example I provided above, the similarities check is ran on the result of a class having all its method signatures stripped, which is equivalent to just the class name and 5 pass
statements. But that's not really duplicate code, especially since each individual function's contents are not evaluated against the min-similarity-lines
. This check is now running against this code:
class Processor:
pass
pass
pass
pass
pass
Which structurally is completely different from the original, and will fail whenever any other class has 5 functions with a pass
body, even though that's not duplicate code.
At least my interpretation of what ignore-signatures
is supposed to do, is that it won't take the function signatures into account of the similarities check, but will instead only look at the contents of the functions or other code. It shouldn't completely remove signatures and assume that the code being looked at is one block of code. Maybe this interpretation is incorrect, but then that's not really documented as explicitly as it could be: https://pylint.pycqa.org/en/latest/user_guide/configuration/all-options.html#ignore-signatures
I definitely missed that 😄
Interestingly, pylint test.py test2.py -j 2
passes.
Starting with pylint 2.14.0, I have code that suddenly triggers R0801 pylint errors when running pylint with only a single worker. It used to work fine before.
Edit: Using pylint 2.13.0 also works with the provided example.
Has there been any progress on this? We were using ignore-signatures because otherwise multilined method signatures hit the duplicate limit when you implement an abstract method. But now we can't because using ignore-signatures means two unrelated interfaces are being flagged as duplicates as in this issue. We'd like to avoid having to increase the min-similarity-lines across the whole codebase to more than the longest interface/longest method signature in an interface as it does usefully flag repeated code, so any updates would be very appreciated.
No one worked on it yet, but I'll happily review a merge request fixing this issue :) Do you want to contribute a fix @blaise17 ?