flake8-bugbear icon indicating copy to clipboard operation
flake8-bugbear copied to clipboard

Suggestion: Ignore B006 if the variable in question is immediately copied

Open eric-wieser opened this issue 3 years ago • 14 comments

I would argue that these two functions:

def foo(x=[]):
    x = list(x)
    x.reverse()
    return x

def bar(x={}):
    x = dict(x)
    x.pop('k', None)
    return x

are far more readable than the way B006 forces me to write them

def foo_bad(x=None):
    if x is None:
        x = []
    x.reverse()
    return x

def bar_bad(x=None):
    if x is None:
        x = {}
    x.pop('k', None)
    return x

Furthermore, the rewrite encouraged by B006 now actually leads to a bug - x = [1]; foo_bad(x) leads to x being mutated.

Would it be possible/sensible to add an exemption to B006 along the lines of "if the only use of the mutable default is to pass into a non-mutating function call, then emit no warning"?

eric-wieser avatar Sep 15 '20 08:09 eric-wieser

Hi, thanks for the suggestion. I personally can't visualize in my head how we'd achieve that logic. If you could get a POC PR up we could consider that.

That said, I also don't get your "bug" from the rewrite. The function foo_bad will do as you ask and reverse the list. What am I missing here?

We've had LOTS of examples in production with bad "import time" defaults setting with mutables, so I always like to air on the side of caution with this one.

cooperlees avatar Sep 15 '20 15:09 cooperlees

I have just hit B006 for a library where we enforce type annotations via mypy. I have:

def whatever(self, optional_map: Mapping[str, str] = {}) -> None:
    pass

In this case, since the optional map arg is annotated as being an immutable map, I know that this is safe as no code can alter the variable. This suggests it is safe to allow mutable default if and only if they are annotated with an immutable type.

rupertnash avatar Mar 24 '21 10:03 rupertnash

B006 is there to protect you from the mutable default (in this case a dict '{}') from being initialized at your codes first pass and then at some stage mutated. This dict can be mutated and the in subsequent calls to the function, you now do not have and empty dictionary ({}) you have a truthy mutated object ({"Default": "now"}) and you could hit bugs due to this.

Generally, with the misleading function default, a developer is not expecting this. This dict is not re-initialized each time the function is called, even when no argument is supplied to the function call. That's the main protection this check is trying to protect developers from. We've hit this in production, and, taking the extra steps to be safe here has saved outages.

I feel in both your cases, if you've made the decision it's safe (I argue it's not) just edit your projects flake8 config to disable B006:

[flake8]
select = B
ignore = B006

With your type annotated argument and if you always pass a dict, then maybe don't set a default? If I am missing something please explain what I am missing in how mypy makes this safe.

cooperlees avatar Mar 24 '21 15:03 cooperlees

and then at some stage mutated

If I am missing something please explain what I am missing in how mypy makes this safe.

Mypy will detect any attempt to mutate the dict and emit an error

def evil(d : MutableMapping[str, str]):
   d[2] = True

def whatever(self, optional_map: Mapping[str, str] = {}) -> None:
    optional_map[1] = True  # mypy errors here, `Mapping` does not implement `__setitem__`.
    evil(optional_map)  # mypy errors here too, `Mapping` is not a `MutableMapping`

eric-wieser avatar Mar 24 '21 15:03 eric-wieser

Hi Cooper- thanks for the quick response. I understand (and support) the logic of the check.

Mypy performs static type checking on the program. If my function whatever above were to use the optional_map argument in a way the mutates it (including passing to a function that takes e.g. a MutableMapping) then it will complain and our CI will not allow the PR to merge to main.

rupertnash avatar Mar 24 '21 15:03 rupertnash

From an early conversation I forgot about:

The function foo_bad will do as you ask and reverse the list. What am I missing here?

In the original function I did not reverse in place:

def foo(x=[]):
    x = list(x)
    x.reverse()
    return x  # this does not modify `x`, and returns a reversed copy

a = [1, 2]
assert foo(a) == [2, 1]  # ok
assert a == [1, 2]  # ok

B006 tells me to remove the [], which could trick me into writing the following as the reader might think "ah, I've fixed the mutability problem that bugbear complained about, so I don't need the copy any more"

def foo_bad(x=None):
    if x is None:
        x = []
    x.reverse()
    return x  # whoops, this now modifies x

when what I actually have to write is the longer

def foo_ok(x=None):
    if x is None:
        x = []
    else:
        x = list(x)
    x.reverse()
    return x  # this does not modify the original `x`

eric-wieser avatar Mar 24 '21 15:03 eric-wieser

(Context: I use mypy / pyre + type annotations a lot, I've never used MutableMapping / Mapping types tho)

Few questions from this:

So what benefit does setting the default dictionary add? If you just omit that, the runtime will also make sure you're always passing a Mapping object every time this function is called? I don't see what benefit this other flow gives you ... Is this to support non standard library immutable Dicts? I feel you both have context about something I don't here.

def whatever(self, map: Mapping[str, str]) -> None:
    map[1] = True  # mypy errors + runtime if no Map is passed
    evil(map)  # ^^same
  1. @eric-wieser With your last example why don't you type annotate to get your "protection"?
def foo_ok(x: Optional[List[Any]] = None) -> List[Any]:
    if x is None:
        x = []
    x.reverse()
    return x
  1. Won't disabling via config do what you both want here? I guess the downside is it will miss other non typed functions - but won't mypy also catch them (setting pending).

That said, if you can ast parse out these scenarios and prove by tests they do exactly what you describe here I'll consider merging. I don't (yet) see the value so I personally won't be spending the time adding this. I feel there are many alternatives available here.

cooperlees avatar Mar 24 '21 15:03 cooperlees

So what benefit does setting the default dictionary add? If you just omit that, the runtime will also make sure you're always passing a Mapping object every time this function is called?

Without the = {}, the caller would have to use self.whatever({}). mypy is not a runtime, it's an analysis tool just like bugbear

With your last example why don't you type annotate to get your "protection"? The foo_ok example you provide is broken in exactly the same way as my foo_bad, as y = foo_bad(x) mutates x.

Won't disabling via config do what you both want here?

Yes; I think our point is that given it has many false positives, it might be better disabled by default unless those false positives are fixed.

That said, if you can ast parse out these scenarios and prove by tests they do exactly what you describe here I'll consider merging. I don't (yet) see the value so I personally won't be spending the time adding this

Thanks - I understand that actually fixing these cases is probably hard, and even if you did see the value it would likely still not be worth the effort for you!

eric-wieser avatar Mar 24 '21 15:03 eric-wieser

  1. The benefit here is that I only have to pass the optional mapping type when I need to. I agree this is tiny for this example, but with multiple "const qualified" optional args in the function signature, this becomes a major convenience to the caller and the implementor

Mypy will then check that callers pass only types that implement the (read only) mapping protocol (e.g. builtin dict) and that the function only uses that type in a read only way.

  1. no comment

  2. This is generally a good rule, I want to keep it on. Mypy will catch some of these errors but not e.g.:

def whatever(self, map: MutableMapping[str, str] = {}) -> None:
    map["key"] ="value"

I have just taken a fork and will see if I can create something...

rupertnash avatar Mar 24 '21 15:03 rupertnash

OK - I've just had a look and the basic thing is very easy.

diff --git a/bugbear.py b/bugbear.py
index d39ff01..733bc2f 100644
--- a/bugbear.py
+++ b/bugbear.py
@@ -345,8 +345,29 @@ class BugBearVisitor(ast.NodeVisitor):
         self.errors.append(B005(node.lineno, node.col_offset))
 
     def check_for_b006(self, node):
-        for default in node.args.defaults + node.args.kw_defaults:
+        def iter_args_with_defaults(a):
+            all_args = a.posonlyargs + a.args
+            narg = len(all_args)
+            ndef = len(a.defaults)
+            assert narg >= ndef
+            # First n args have no default
+            n = narg - ndef
+            return itertools.chain(
+                zip(all_args[n:], a.defaults),
+                zip(a.kwonlyargs, a.kw_defaults)
+            )
+        def canonicalise_type_annotation(ann):
+            # For generic types (e.g. Sequence[T]) only care about
+            # the generic.
+            base = ann.value if isinstance(ann, ast.Subscript) else ann
+            return list(self.compose_call_path(base))
+
+        for arg, default in iter_args_with_defaults(node.args):
             if isinstance(default, B006.mutable_literals):
+                if arg.annotation:
+                    ann = canonicalise_type_annotation(arg.annotation)
+                    if is_immutable_annotation(ann):
+                        continue
                 self.errors.append(B006(default.lineno, default.col_offset))
             elif isinstance(default, ast.Call):
                 call_path = ".".join(self.compose_call_path(default.func))

However, because we only have the AST, figuring out whether the Name of our annotation (e.g. Name(id='Mapping', ctx=Load())) is actually typing.Mapping (or, since 3.9 collections.abc.Mapping) is non-trivial... (i.e. implementation of is_immutable_annotation)

We would have to implement a reasonably complex bit of static analysis to solve this, which I don't have time for.

Perhaps the correct approach for me is to see if mypy can produce a type-aware version of B006 while disabling it here. Thanks for your thoughts.

rupertnash avatar Mar 26 '21 12:03 rupertnash

The issue is more general than "is immediately copied". The issue is whether that input argument is ever mutated.

For example, I write lots of code that looks like the following, which I'd argue is perfectly fine.

def f(..., optional: list[str] = []):
    ...
    whatever.extend(x for x in optional if x in y)

The naive B006-avoiding alternative is the following, which I'd argue is not OK as it introduces unnecessary complexity.

def f(..., optional: Optional[list[str]] = None):
    ...
    if optional:
        whatever.extend(x for x in optional if x in y)

Maybe the better alternative is to use an immutable default instead:

def f(..., optional: Iterable[str] = tuple()):
    ...
    whatever.extend(x for x in optional if x in y)

This works for foo in OP's question but for bar there's no built-in immutable mapping (yet: https://www.python.org/dev/peps/pep-0603/). In that situation, maybe it's enough to just rely on mypy instead: https://github.com/PyCQA/flake8-bugbear/issues/137#issuecomment-805921891

[In case it wasn't obvious, I'm figuring this out in real-time.]

jamesmyatt avatar Apr 15 '21 10:04 jamesmyatt

-1 on this suggestion unless it can be configured.

would point out in code review that the default arg should be None because it's much clearer than copying the mutable default, so this change would be counterproductive

ThiefMaster avatar Jun 19 '21 15:06 ThiefMaster

@ThiefMaster, if you requested that change when reviewing the code in https://github.com/PyCQA/flake8-bugbear/issues/137#issuecomment-805924849, you would be introducing a bug

eric-wieser avatar Jun 19 '21 16:06 eric-wieser

BTW, I was wrong in https://github.com/PyCQA/flake8-bugbear/issues/137#issuecomment-820303780: there is an immutable mapping. It's types.MappingProxyType and has been there since Python 3.3. And this is what bugbear is expecting you to use: https://github.com/PyCQA/flake8-bugbear/blob/63239e09ededc6efe841555376f288b115154953/tests/b006_b008.py#L20

With this, I think there's (probably) no good reason to have a mutable default argument, unless you need it to be mutable (e.g. as a cache). So you should be writing code like the following:

# Good
def f1(..., optional: Collection[Any] = tuple()):
   ...

def f2(..., optional: Mapping[str, Any] = types.MappingProxyType({})):
   ...

# Bad
def f1(..., optional: Collection[Any] = []):
   ...

def f2(..., optional: Mapping[str, Any] = {}):
   ...

jamesmyatt avatar Jun 21 '21 09:06 jamesmyatt