Expand `RUF015` rule for `list(iterable).pop(0)` idiom
I just saw another method of accessing the first element of an iterable / dictionary key that is very inefficient. Currently, this is not detected by RUF015 which we have enabled. We should expand RUF015 to cover this idiom. A minimal problematic example is found below.
first_element = list(iterable).pop(0)
It should be
first_element = next(iter(iterable))
which would not require iterating through the entire iterable just to retrieve the first element. Expanding the autofix to cover this would be helpful as well.
- RUFF 0.1.8 (and all earlier versions are affected).
👍 Makes sense!
>>> iterable = [1,2]
>>> list(iterable).pop()
2
>>> next(iter(iterable))
1
>>>
Oh, oops, the .pop() takes the last item, not the first. We could wrap in a reversed...
Oh whoops. Good catch, I misremembered the default arg of pop. This still applies to pop(0) though. The reversed iter next isn't bad though. I wonder if a deque could be abused to grab the last element too.
Using reversed instead of iter in the .pop() case is great no? (no need to reversed(iter(...)) I think?)
-list(iterable).pop(0)
+next(iter(iterable))
-list(iterable).pop()
+next(reversed(iterable))
Oh, great point!
~Actually, one concern if the generator is very, very long and if the iterable doesn't implement __reversed__, then the next(reversed(iterable)) could take a lot of memory, right? I think~ the most efficient solution is to use reversed if it does implement this method, otherwise use deque(iterable, max_len=1).pop(), right? reversed only works on sequences not iterators so I am not sure it's a drop in replacement sadly.
a = iter(range(10))
b = reversed(a)
will return an error.
oh right, this is a tricky one. However, I'm not sure deque(iterable, max_len=1).pop() will actually be faster. On my machine it's actually slower. The iterable is probably evaluated entirely idk:
%timeit deque(range(10000), maxlen=1).pop()
88.7 µs ± 8.44 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
%timeit list(range(10000)).pop()
73.8 µs ± 252 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
Yeah, this is more about memory management than speed. I think we agree that thepop(0) change is definitely not controversial though.
(this should be an unsafe fix)
I am a beginner in Rust and would like to contribute. Can I take this up? Going through the comments I got some idea but its not clear what is the solution.
I am a beginner in Rust and would like to contribute. Can I take this up?
Sure!
Going through the comments I got some idea but its not clear what is the solution.
So, currently the implementation takes in an ExprSubscript which matches the list(iterable)[0] code (AST Playground). What we want is to also look for code like list(iterable).pop(0) which is an ExprAttribute (AST Playground).
This will require updating the function signature to now take any Expr to allow for either Expr::Subscript or Expr::Attribute. You'll need to update the function body to check for the .pop(0) idiom along with the existing [0] (is_head_slice). Rest of the code should probably be the same.
The fix for this rule is already unsafe and it would help expand on the fix safety documentation of the rule with why this fix is unsafe w.r.t. the new code.
Hope this helps, feel free to ask any other questions that you might have. I'll assign this issue to you but don't feel any pressure to complete this :)
Since there was no activity on this issue yet (followed it for the last 2 weeks), I stepped ahead and provided a fix in https://github.com/astral-sh/ruff/pull/10148. I hope I interpreted the inactivity on this ticket correctly and didn't steal someones chance. @dhruvmanila, @charliermarsh, do you mind reviewing the PR? Thanks a lot.
@robincaloudis Thanks for taking it up, I was not able to look into it despite asking for the ticket to be assigned.