ruff icon indicating copy to clipboard operation
ruff copied to clipboard

Expand `RUF015` rule for `list(iterable).pop(0)` idiom

Open Skylion007 opened this issue 2 years ago • 12 comments

I just saw another method of accessing the first element of an iterable / dictionary key that is very inefficient. Currently, this is not detected by RUF015 which we have enabled. We should expand RUF015 to cover this idiom. A minimal problematic example is found below.

first_element = list(iterable).pop(0)

It should be

first_element = next(iter(iterable))

which would not require iterating through the entire iterable just to retrieve the first element. Expanding the autofix to cover this would be helpful as well.

  • RUFF 0.1.8 (and all earlier versions are affected).

Skylion007 avatar Dec 18 '23 18:12 Skylion007

👍 Makes sense!

charliermarsh avatar Dec 18 '23 19:12 charliermarsh

>>> iterable = [1,2]
>>> list(iterable).pop()
2
>>> next(iter(iterable))
1
>>> 

T-256 avatar Dec 19 '23 13:12 T-256

Oh, oops, the .pop() takes the last item, not the first. We could wrap in a reversed...

charliermarsh avatar Dec 19 '23 14:12 charliermarsh

Oh whoops. Good catch, I misremembered the default arg of pop. This still applies to pop(0) though. The reversed iter next isn't bad though. I wonder if a deque could be abused to grab the last element too.

Skylion007 avatar Dec 19 '23 16:12 Skylion007

Using reversed instead of iter in the .pop() case is great no? (no need to reversed(iter(...)) I think?)

-list(iterable).pop(0)
+next(iter(iterable))

-list(iterable).pop()
+next(reversed(iterable))

UnknownPlatypus avatar Dec 19 '23 17:12 UnknownPlatypus

Oh, great point!

Skylion007 avatar Dec 19 '23 17:12 Skylion007

~Actually, one concern if the generator is very, very long and if the iterable doesn't implement __reversed__, then the next(reversed(iterable)) could take a lot of memory, right? I think~ the most efficient solution is to use reversed if it does implement this method, otherwise use deque(iterable, max_len=1).pop(), right? reversed only works on sequences not iterators so I am not sure it's a drop in replacement sadly.

a = iter(range(10))
b = reversed(a)

will return an error.

Skylion007 avatar Dec 19 '23 17:12 Skylion007

oh right, this is a tricky one. However, I'm not sure deque(iterable, max_len=1).pop() will actually be faster. On my machine it's actually slower. The iterable is probably evaluated entirely idk:

%timeit deque(range(10000), maxlen=1).pop()
88.7 µs ± 8.44 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
%timeit list(range(10000)).pop()
73.8 µs ± 252 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

UnknownPlatypus avatar Dec 19 '23 18:12 UnknownPlatypus

Yeah, this is more about memory management than speed. I think we agree that thepop(0) change is definitely not controversial though.

Skylion007 avatar Dec 19 '23 20:12 Skylion007

(this should be an unsafe fix)

hauntsaninja avatar Feb 08 '24 09:02 hauntsaninja

I am a beginner in Rust and would like to contribute. Can I take this up? Going through the comments I got some idea but its not clear what is the solution.

rmad17 avatar Feb 14 '24 09:02 rmad17

I am a beginner in Rust and would like to contribute. Can I take this up?

Sure!

Going through the comments I got some idea but its not clear what is the solution.

So, currently the implementation takes in an ExprSubscript which matches the list(iterable)[0] code (AST Playground). What we want is to also look for code like list(iterable).pop(0) which is an ExprAttribute (AST Playground).

This will require updating the function signature to now take any Expr to allow for either Expr::Subscript or Expr::Attribute. You'll need to update the function body to check for the .pop(0) idiom along with the existing [0] (is_head_slice). Rest of the code should probably be the same.

The fix for this rule is already unsafe and it would help expand on the fix safety documentation of the rule with why this fix is unsafe w.r.t. the new code.

Hope this helps, feel free to ask any other questions that you might have. I'll assign this issue to you but don't feel any pressure to complete this :)

dhruvmanila avatar Feb 15 '24 04:02 dhruvmanila

Since there was no activity on this issue yet (followed it for the last 2 weeks), I stepped ahead and provided a fix in https://github.com/astral-sh/ruff/pull/10148. I hope I interpreted the inactivity on this ticket correctly and didn't steal someones chance. @dhruvmanila, @charliermarsh, do you mind reviewing the PR? Thanks a lot.

robincaloudis avatar Feb 27 '24 22:02 robincaloudis

@robincaloudis Thanks for taking it up, I was not able to look into it despite asking for the ticket to be assigned.

rmad17 avatar Feb 29 '24 02:02 rmad17