wemake-python-styleguide
wemake-python-styleguide copied to clipboard
Check that generator expressions are safe
Rule request
Thesis
We can create generator expression that produce a different result from what we expect.
>>> a = 1
>>> b = (a * i for i in range(5))
>>> a = 2
>>> print(list(b))
[0, 2, 4, 6, 8]
Link: https://twitter.com/asmeurer/status/1371952741728194561
Reasoning
Comprehensions like this should be safe. We should not be able to reassign values that comprehensions rely on.
Hey 👋 , I'm new to the project and I'd like to contribute. Could you assign me this issue?
Thanks, @GibranHL0!
Great. I'll make the proposal as soon as possible.
Hey @sobolevn, I got my approach to the issue. 👋 According to the PEP 289, the code uses a generator expression, which after a single usage, the memory is free. Then, we should allow the assignment of the involved variables after the generator usage, as they no longer affect it.
>>> a = 1
>>> b = (a * i for i in range(5))
>>> print(list(b))
[0, 1, 2, 3, 4]
>>> a = 2
The code works fine!
So, my solution would be:
- Identify a generator expression by the elements of an open and closing parenthesis "()", and the word for and in.
- Identify if the generator expression is instantiated or if it's assigned to another variable. If the expression is used on the same line, there is no need to check anything else. Otherwise ...
- Identify the variables that affect the behavior of the generator expression, which are before the for word.
>>> ( a * i for i in range(5))
>>> var_before_for = ["a", "i"]
- Identify the variable that is created in the generator, which is between the for and in.
>>> ( a * i for i in range(5))
>>> var_for_in = ["i"]
- Get the list of the variables that affect the behavior according to the list of variables before the for minus the variable between the for and in
>>> var_affect = ["a"]
- Check the code after the generator expression, if there is an assignment of one of the variables stored in the previously created list before the generator expression is used, we raise a violation.
I'd use a tokenize visitor and violation to implement it.
What do you think? 🤔
Looks great! But, I think that ASTVisitor is easier to work with in this case.
Sure, I was checking, and I think is a better idea to use the ASTVisitor instead of the TokenVisitor.
I'll upload the PR as soon as possible. ✌️