vulture icon indicating copy to clipboard operation
vulture copied to clipboard

Improve unreachable code analysis

Open kreathon opened this issue 2 years ago • 5 comments

Motivation

def f():
    try:
        return
    except:
        raise Exception()
    print("Unreachable")

At the moment, the print statement is not identified as unreachable.

Implementation

I added a data structure (no_fall_through_nodes) that stores ast nodes that do not allow a fall through. During the ast traversal continue, break, raise and return are added into the this data structure. For every control flow statement / object (for, try, while, with, but also module and functiondef), we check if any of the statement in the body is in no_fall_through_nodes. If that is the case, we report the next statement (if there is any) and add the current node also into no_fall_through_nodes. To make this work generic_visit (means visiting children) is now executed before visiting the current node.

The algorithm was added into the existing Vulture object with the generic_visit (which handles recursion). I think a cleaner implementation could implement a separate node visitor (that handles its own recursion), but I was not sure if the project is open for that.

The old error reporting message is reused (e.g. unreachable code after 'try'). I am not sure if this is the best message or if all unreachable messages should be simplified to something like unreachable code.

Limitations

  • does not support match statements
  • does not support with statements (context manager can suppress exceptions)

Related Issue

This PR is addressing the issue https://github.com/jendrikseipp/vulture/issues/270.

Checklist:

  • [x] I have updated the documentation in the README.md file or my changes don't require an update.
  • [x] I have added an entry in CHANGELOG.md.
  • [x] I have added or adapted tests to cover my changes.
  • [x] I have run tox -e fix-style to format my code and checked the result with tox -e style.

kreathon avatar Feb 17 '23 10:02 kreathon

Codecov Report

Merging #302 (a59eac4) into main (fc23f33) will increase coverage by 0.06%. Report is 13 commits behind head on main. The diff coverage is 100.00%.

:exclamation: Current head a59eac4 differs from pull request most recent head 8d9fcf4. Consider uploading reports for the commit 8d9fcf4 to get more accurate results

@@            Coverage Diff             @@
##             main     #302      +/-   ##
==========================================
+ Coverage   98.96%   99.03%   +0.06%     
==========================================
  Files          21       21              
  Lines         679      727      +48     
==========================================
+ Hits          672      720      +48     
  Misses          7        7              
Files Changed Coverage Δ
vulture/core.py 98.75% <100.00%> (+0.21%) :arrow_up:
vulture/whitelists/ast_whitelist.py 100.00% <100.00%> (ø)

... and 2 files with indirect coverage changes

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

codecov-commenter avatar Feb 17 '23 10:02 codecov-commenter

This goes in the direction of adding a second NodeVisitor class, but avoids traversing the AST twice. Or do you think a second NodeVisitor class (with all of these visitor_* functions) would be a better alternative?

I do not see a clear advantage of one of the solutions and it probably does not matter too much right now (if the reachability is in a second module it should be really easy to switch between them anyway).

I will update the code according to your proposed code organization.

kreathon avatar Sep 03 '23 10:09 kreathon

I am not sure if passing _define as report into Reachaiblity is the cleanest solution, but I did not want to create a new datastructure that passes the information back to the Vulture object. What do you think?

kreathon avatar Sep 03 '23 10:09 kreathon

I had a look at it again, and I think it is cleaner to also move the _handle_conditional_node into the reachablitly.py (such that the module handles the entire self.unreachable_code LoggingList data structure).

I will propose the change as soon as I find time for it.

kreathon avatar Sep 05 '23 17:09 kreathon

I updated

  • check_unreachable for checking multiple unreachable code segments (alternative would be to add another method for that or assert that there is only a single unreachable code segment separately)
  • move some tests from test_conditions.py into test_reachability.py
  • merge handle_condition_node() into Reachability to get more powerful analysis results. There are still cases that are not handled (we need to distinguish between raise/return and break/continue for that).

For example.

while True:
    raise Exception()
print(b)

I could handle this in another PR.

kreathon avatar Sep 09 '23 09:09 kreathon