wemake-python-styleguide
Forbid map/filter with lambda
Rule request
Thesis
Forbid using lambda inside of map and filter.
# bad
map(lambda x: x*2, lst)
# good
(x*2 for x in lst)
# bad
filter(lambda x: x%2, lst)
# good
(x for x in lst if x%2)
# using functions instead of lambdas is ok
map(str, lst)
filter(str.islower, lst)
Reasoning
Generator expressions are good, easier to read, faster, and even shorter.
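As a minimal sketch of the equivalence the proposed rule relies on (the sample list `lst` is made up for illustration), each lambda-based call below yields the same items as its generator expression:

```python
lst = [1, 2, 3, 4, 5]  # sample data, assumed for illustration

# map with a lambda vs. the equivalent generator expression
doubled_map = list(map(lambda x: x * 2, lst))
doubled_gen = list(x * 2 for x in lst)

# filter with a lambda vs. the equivalent generator expression
odd_filter = list(filter(lambda x: x % 2, lst))
odd_gen = list(x for x in lst if x % 2)

# passing an existing function needs no lambda, so it stays allowed
as_text = list(map(str, lst))
```

Both forms are lazy until consumed; `list()` is used here only to compare the results.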
@orsinium I have one example where map/filter with lambdas is more readable: when you chain your generators.
# good
objs_iterable = filter(lambda obj: obj.date > now(), data)
objs_iterable = map(lambda obj: ..., objs_iterable)
objs_iterable = filter(lambda obj: ..., objs_iterable)
# bad
obj_iterable = (obj for obj in data if obj.date > now())
obj_iterable = (obj**2 for obj in obj_iterable)
obj_iterable = (obj for obj in obj_iterable if obj ...)
Here we can clearly see that we use filter and map. With generators only, one has to look inside each generator to understand what's going on.
The problem with map and filter in Python is that they change the value type.
Example: filter(List[int]) -> Iterable[int]
That's why I don't like using these functions on anything other than generators. This would be a good fit for typed-linter.
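A small sketch of the type change described above, using a made-up `numbers` list: the result of `filter` is a one-shot iterator, so list operations such as `len()` no longer work on it.

```python
from collections.abc import Iterator

numbers = [1, 2, 3, 4]  # sample data, assumed for illustration

filtered = filter(lambda x: x % 2, numbers)

# The result is a lazy, one-shot iterator, not a list.
is_list = isinstance(filtered, list)
is_iterator = isinstance(filtered, Iterator)

# Sequence operations such as len() are not available on it.
try:
    len(filtered)
    supports_len = True
except TypeError:
    supports_len = False

first_pass = list(filtered)   # consumes the iterator
second_pass = list(filtered)  # already exhausted, so this is empty
```

A generator expression is lazy in the same way; only a list comprehension would keep the list type.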
I have one example where map/filter with lambdas is more readable: when you chain your generators.
This is a great functional example but when you have such complicated constructions, I'd recommend making it in a non-functional way. For your example above:
def get_even_squares(items):
    for item in items:
        if not item:
            continue
        item = item ** 2
        if item % 2:
            continue
        yield item
More lines but readable, testable, imperative.
And slower on bigger sequences.
Are you sure? How big? In which cases?
%timeit list(get_even_squares(range(1_000_000)))
426 ms ± 1.18 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit list(filter(lambda x: not x%2, map(lambda x: x**2, filter(lambda x: x, range(1_000_000)))))
602 ms ± 2.13 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
I'm not well versed in interpreter internals, but I think the generator costs one stack push-pop per iteration, while the functional way has one context per lambda: in the case above, 3 stack pushes per iteration.
Well, let's take a more real-life example, where items is not a sequence of ints but a sequence of complex objects, e.g. dataclass wrappers.
Better benchmarks are welcome. I leave the final decision to the WPS team.
Let's add some long-running operations to the function:
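For anyone who wants to rerun the comparison outside IPython, here is a sketch using the standard `timeit` module. The helper names `even_squares_functional` and `benchmark` are made up for this example, and the timings will vary by machine and Python version, so no expected numbers are given.

```python
import timeit


def get_even_squares(items):
    # imperative generator version from the comment above
    for item in items:
        if not item:
            continue
        item = item ** 2
        if item % 2:
            continue
        yield item


def even_squares_functional(items):
    # the same pipeline built from map/filter with lambdas
    return filter(
        lambda x: not x % 2,
        map(lambda x: x ** 2, filter(lambda x: x, items)),
    )


def benchmark(size=100_000, repeat=5):
    """Return the best wall-clock time in seconds for each variant."""
    data = range(size)
    return {
        "generator": min(
            timeit.repeat(lambda: list(get_even_squares(data)), number=1, repeat=repeat)
        ),
        "functional": min(
            timeit.repeat(lambda: list(even_squares_functional(data)), number=1, repeat=repeat)
        ),
    }


if __name__ == "__main__":
    print(benchmark())
```

Taking the minimum over several repeats reduces noise from other processes; it says nothing about how the two variants compare once each step does real work.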
def get_users(users):
    for user in users:
        if not user:
            continue
        user.update_groups_from_db()
        if 'editor' not in user.groups:
            continue
        user.update_permissions_from_db()
        if 'create_smth' not in user.permissions:
            continue
        yield user
# or
# assumes the update_* methods return the user, so each map keeps passing users along
users = filter(lambda user: bool(user), users)
users = map(lambda user: user.update_groups_from_db(), users)
users = filter(lambda user: 'editor' in user.groups, users)
users = map(lambda user: user.update_permissions_from_db(), users)
users = filter(lambda user: 'create_smth' in user.permissions, users)
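To make the two variants comparable, here is a runnable sketch. Everything outside the two pipelines is hypothetical: the `User` dataclass, the in-memory `GROUPS_DB` and `PERMISSIONS_DB` dictionaries standing in for the database, and the assumption that the `update_*` methods return the user so that `map` can be chained.

```python
from dataclasses import dataclass, field

# Hypothetical in-memory stand-ins for the database lookups.
GROUPS_DB = {"ann": {"editor"}, "bob": {"viewer"}, "cy": {"editor"}}
PERMISSIONS_DB = {"ann": {"create_smth"}, "bob": set(), "cy": set()}


@dataclass
class User:
    name: str
    groups: set = field(default_factory=set)
    permissions: set = field(default_factory=set)

    def update_groups_from_db(self):
        self.groups = GROUPS_DB.get(self.name, set())
        return self  # assumption: returning self keeps map() chainable

    def update_permissions_from_db(self):
        self.permissions = PERMISSIONS_DB.get(self.name, set())
        return self


def get_users(users):
    # imperative version: skip early, load data only when needed
    for user in users:
        if not user:
            continue
        user.update_groups_from_db()
        if "editor" not in user.groups:
            continue
        user.update_permissions_from_db()
        if "create_smth" not in user.permissions:
            continue
        yield user


def get_users_functional(users):
    # functional version: each step is a lazy filter or map
    users = filter(lambda user: bool(user), users)
    users = map(lambda user: user.update_groups_from_db(), users)
    users = filter(lambda user: "editor" in user.groups, users)
    users = map(lambda user: user.update_permissions_from_db(), users)
    return filter(lambda user: "create_smth" in user.permissions, users)


def make_users():
    return [User("ann"), None, User("bob"), User("cy")]


imperative_names = [u.name for u in get_users(make_users())]
functional_names = [u.name for u in get_users_functional(make_users())]
```

Because the chain is lazy, both versions load permissions only for users who passed the group check.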
I'm not saying that lambdas are good and should be used everywhere. I'm saying that sometimes they are more suitable than generators or functions.
@AlwxSin consider using real monads instead! (just joking 🙂 )
I'm not saying that lambdas are good and should be used everywhere. I'm saying that sometimes they are more suitable than generators or functions.
I totally agree with this statement. There are cases in which using lambdas makes your code cleaner, more readable, and even faster. There are also cases in which using them transforms your code into a mess of complicated functional nonsense. Finding a balance between lambdas, generators, and functions is essential to writing good code, but simply forbidding their usage will not necessarily improve someone's code.
In my humble opinion, the usage of lambdas depends on the context and complexity of the code, and thus forbidding it might be too strict for a large number of developers.