vulture
vulture copied to clipboard
False positive when enum is used in iteration
Enum classes allow iteration over the class, which returns all enum members. This is often convenient particularly when writing tests. Unfortunately, vulture does not understand this and reports enum members as unused. For example:
from enum import Enum
class E(Enum):
A = 1
B = 2
list(E)
produces these errors:
$ python -m vulture test_enum.py
test_enum.py:4: unused variable 'A' (60% confidence)
test_enum.py:5: unused variable 'B' (60% confidence)
This can be whitelisted, but it's a bit tedious when there are many enum members.
Some ideas on how this can be improved:
- It may be possible to detect some cases of iteration like
list(E)
,tuple(E)
,a, b, c = E
,for e in E
and mark all members as used. - For a work-around, adding a class-level comment or annotation that all members are used can make whitelisting much easier. This can help in other cases as well, for example this issue about dataclasses: #264.
- Maybe if the enum class is listed in
__all__
, all members can be considered used?
Thanks for the report! All proposed solutions have in common that they need to find out which variables are defined in the scope of the class. Since Vulture currently has no notion of scope, they are tricky to implement. If you see a solution, I'm all ears :-)
@jendrikseipp That sounds like an even bigger problem :-) Adding scope information should improve the accuracy quite a bit. Python scoping rules are not very complicated.
There are some libraries that enrich the abstract syntax tree (on which Vulture operates) with scoping information, but I haven't looked deeper into them. I think that could be a good starting point.
Can I work on this?
Sure, happy to see progress here! It might be a good idea to discuss the high level approach though, before you spend time on implementing the details. If you have a high-level plan, I'm happy to discuss it.
@jendrikseipp looked into this a little more, and wanted to run the current implementation idea by you...
Progress thus far:
- Identified that Vulture generates an AST of the provided source code in the
scan
function incore.py
- Each node in the AST is then visited by the
visit
function - The
visit
function decides what helper to use based on the type of node - For this issue, we specifically care about the
visit_Call
andvisit_ClassDef
functions because these are invoked when thelist()
function is called, and when a class is defined with theEnum
subclass.
Based on these findings, here is a high level approach of the solution:
- Extend the ast with scoping information using the asttokens package.
- In
visit_Call
, check if thenode.func.id=='list'
- are we calling the list function? - If so, get the argument(
node.args[0].id
) and use the scoping information provided by asttokens to check if the argument is a class - If so, use the scope info about the class to check if
Enum
is a superclass. - If so, then add all the fields defined in the class to
defined_variables
Yes, your explanation makes sense and the approach is correct, I'd say. (In the last step, you probably mean "used_variables".)
The only thing I'm unsure about is the AST library. Have you compared asttokens with https://github.com/kavigupta/ast_scope and https://github.com/pylint-dev/astroid ? We need to pick one that does the job, is well-maintained and doesn't have too many dependencies.