ruff icon indicating copy to clipboard operation
ruff copied to clipboard

[`ruff`] Re-implement `unreachable`

Open augustelalande opened this issue 1 year ago • 8 comments

Summary

This PR re-introduces the control-flow graph implementation which was first introduced in #5384, and then removed in #9463 due to not being feature complete. Mainly, it lacked the ability to process try-except blocks, along some more minor bugs.

I will now highlight the major changes implemented in this PR, in order of implementation.

  1. Introduced a post-processing step in loop handling to find any continue or break statements within the loop body and redirect them appropriately.
  2. Introduced a loop-continue block which is always placed at the end of loop blocks, and ensures proper looping regardless of the internal logic of the block. This resolves #8958.
  3. Implemented try processing with the following logic (resolves #8959):
    1. In the example below the cfg first encounters a conditional ExceptionRaised forking if an exception was (or will be) raised in the try block. This is not possible to know (except for trivial cases) so we assume both paths can be taken unconditionally.
    2. Going down the try path the cfg goes try->else->finally unconditionally.
    3. Going down the except path the cfg will meet several conditional ExceptionCaught which fork depending on the nature of the exception caught. Again there's no way to know which exceptions may be raised so both paths are assumed to be taken unconditionally.
    4. If none of the exception blocks catch the exception then the cfg terminates by raising a new exception.
    5. A post-processing step is also implemented to redirect any raises or returns within the blocks appropriately.
def func():
    try:
        print("try")
    except Exception:
        print("Exception")
    except OtherException as e:
        print("OtherException")
    else:
        print("else")
    finally:
        print("finally")
flowchart TD
  start(("Start"))
  return(("End"))
  block0[["`*(empty)*`"]]
  block1["print(#quot;finally#quot;)\n"]
  block2["print(#quot;else#quot;)\n"]
  block3["print(#quot;try#quot;)\n"]
  block4[["Exception raised"]]
  block5["print(#quot;OtherException#quot;)\n"]
  block6["try:
        print(#quot;try#quot;)
    except Exception:
        print(#quot;Exception#quot;)
    except OtherException as e:
        print(#quot;OtherException#quot;)
    else:
        print(#quot;else#quot;)
    finally:
        print(#quot;finally#quot;)\n"]
  block7["print(#quot;Exception#quot;)\n"]
  block8["try:
        print(#quot;try#quot;)
    except Exception:
        print(#quot;Exception#quot;)
    except OtherException as e:
        print(#quot;OtherException#quot;)
    else:
        print(#quot;else#quot;)
    finally:
        print(#quot;finally#quot;)\n"]
  block9["try:
        print(#quot;try#quot;)
    except Exception:
        print(#quot;Exception#quot;)
    except OtherException as e:
        print(#quot;OtherException#quot;)
    else:
        print(#quot;else#quot;)
    finally:
        print(#quot;finally#quot;)\n"]

  start --> block9
  block9 -- "Exception raised" --> block8
  block9 -- "else" --> block3
  block8 -- "Exception" --> block7
  block8 -- "else" --> block6
  block7 --> block1
  block6 -- "OtherException" --> block5
  block6 -- "else" --> block4
  block5 --> block1
  block4 --> return
  block3 --> block2
  block2 --> block1
  block1 --> block0
  block0 --> return
  1. Implemented with processing with the following logic:
    1. with statements have no conditional execution (apart from the hidden logic handling the enter and exit), so the block is assumed to execute unconditionally.
    2. The one exception is that exceptions raised within the block may result in control flow resuming at the end of the block. Since it is not possible know if an exception will be raised, or if it will be handled by the context manager, we assume that execution always continues after with blocks even if the blocks contain raise or return statements. This is handled in a post-processing step.

Test Plan

Additionaly test fixtures, and control-flow fixtures were added.

augustelalande avatar Apr 11 '24 22:04 augustelalande

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+21 -0 violations, +0 -0 fixes in 5 projects; 49 projects unchanged)

RasaHQ/rasa (+1 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ tests/core/test_exporter.py:312:13: RUF014 Unreachable code in `_mocked_fetch`
apache/airflow (+15 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

+ airflow/providers/jenkins/operators/jenkins_job_trigger.py:75:5: RUF014 Unreachable code in `jenkins_request_with_headers`
+ airflow/triggers/base.py:102:9: RUF014 Unreachable code in `run`
+ airflow/triggers/testing.py:52:13: RUF014 Unreachable code in `run`
+ dev/example_dags/update_example_dags_paths.py:50:5: RUF014 Unreachable code in `check_if_url_exists`
+ tests/models/test_baseoperator.py:1055:21: RUF014 Unreachable code in `my_work`
+ tests/models/test_mappedoperator.py:1040:21: RUF014 Unreachable code in `other_setup`
+ tests/models/test_mappedoperator.py:1056:21: RUF014 Unreachable code in `my_setup`
+ tests/models/test_mappedoperator.py:1082:21: RUF014 Unreachable code in `other_setup`
+ tests/models/test_mappedoperator.py:1257:21: RUF014 Unreachable code in `my_work`
+ tests/models/test_mappedoperator.py:1274:21: RUF014 Unreachable code in `my_work`
+ tests/models/test_mappedoperator.py:1581:17: RUF014 Unreachable code in `my_work`
+ tests/models/test_mappedoperator.py:1620:25: RUF014 Unreachable code in `my_work`
+ tests/models/test_mappedoperator.py:1668:25: RUF014 Unreachable code in `my_work`
+ tests/models/test_mappedoperator.py:968:21: RUF014 Unreachable code in `my_setup`
+ tests/models/test_taskinstance.py:3444:13: RUF014 Unreachable code in `on_finish_callable`
pandas-dev/pandas (+1 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ pandas/io/parsers/python_parser.py:1315:25: RUF014 Unreachable code in `_get_lines`
python-poetry/poetry (+1 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ src/poetry/utils/authenticator.py:256:9: RUF014 Unreachable code in `request`
pytest-dev/pytest (+3 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ src/_pytest/config/__init__.py:1834:9: RUF014 Unreachable code in `_assertion_supported`
+ testing/code/test_code.py:151:17: RUF014 Unreachable code in `test_bad_getsource`
+ testing/code/test_code.py:167:17: RUF014 Unreachable code in `test_getsource`
Changes by rule (1 rules affected)

code total + violation - violation + fix - fix
RUF014 21 21 0 0 0

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

github-actions[bot] avatar Apr 11 '24 22:04 github-actions[bot]

Nice, I'm a fan of this.

charliermarsh avatar Apr 11 '24 23:04 charliermarsh

No promises, just started taking a look at it.

augustelalande avatar Apr 11 '24 23:04 augustelalande

Can you give me some background on why this was never activated? Was there just too many false positives?

augustelalande avatar Apr 11 '24 23:04 augustelalande

No, I think it's fairly reliable, but I believe try-except handling wasn't quite finished: https://github.com/astral-sh/ruff/issues/8959. There's also at least one bug to fix: https://github.com/astral-sh/ruff/issues/8958.

charliermarsh avatar Apr 12 '24 00:04 charliermarsh

CodSpeed Performance Report

Merging #10891 will not alter performance

Comparing augustelalande:unreachable (a168d97) with main (66fe226)

Summary

✅ 32 untouched benchmarks

codspeed-hq[bot] avatar Apr 16 '24 00:04 codspeed-hq[bot]

This is probably not completely ready for merging, but considering the size it's probably worth getting some feedback now. All the ecosystem checks seem like true positives, there might still be some false negatives, but those will be harder to find.

augustelalande avatar Apr 27 '24 21:04 augustelalande

@charliermarsh when you get the time please have a look at this

augustelalande avatar May 09 '24 04:05 augustelalande

Sorry for the emberassing long wait. I make this a priority for next week.

The best way to review this change is probably to compare all changes starting after the "Revert" commit. If you have any other review recommendations, let me know :)

MichaReiser avatar Aug 01 '24 20:08 MichaReiser

@MichaReiser no worries. Ya that's a good strategy, but also take a look at my PR summary at the top, I tried to give an outline of the changes I made.

augustelalande avatar Aug 01 '24 21:08 augustelalande

I rebased the commit onto main

MichaReiser avatar Aug 05 '24 10:08 MichaReiser

ibis-project/ibis (error)

I did attach a debugger and it seems that it overflows in post_process_try when analysing "ibis/ibis/backends/pyspark/__init__.py"

MichaReiser avatar Aug 05 '24 11:08 MichaReiser

@MichaReiser Ok I think I've addressed all your comments, time for second round of review.

I did have one issue I couldn't resolve. You can see it in one of the test fixtures, but essentially in the following

def if_elif_always_false():
    if False:
        return "unreachable"
    elif False:
        return "also unreachable"
    return "reachable"

the two unreachable lines get combined as a single violation

RUF014.py:21:9: RUF014 Unreachable code in `if_elif_always_false`
   |
19 |   def if_elif_always_false():
20 |       if False:
21 |           return "unreachable"
   |  _________^
22 | |     elif False:
23 | |         return "also unreachable"
   | |_________________________________^ RUF014
24 |       return "reachable"

this is because I combine blocks which follow each other if there are no reachable blocks in between, and since the elif doesn't make its own block, I couldn't figure out how to split them.

augustelalande avatar Sep 05 '24 05:09 augustelalande

I plan to look at this PR again but I don't know yet when I have the time to do so.

MichaReiser avatar Sep 18 '24 07:09 MichaReiser