patma Adding __match_args__ is backwards incompatible

Pablo pointed out that adding this is usually backwards incompatible. Maybe we need to rethink the special casing of object(c)?

https://mail.python.org/archives/list/[email protected]/message/STJSSAETMTUY7FK5AE53IM73Z2WORNYN/

Jun 25 '20 21:06 gvanrossum

Consider #100 ? Needs no __match_args__. A default implementation made for object will allow keyword patterns without doing any pre-settings.

For the possible reason that you rejected this...

I think @viridia's idea of returning a proxy object from match that in simple cases can just be self is a winner.

With #100, by providing a utility library:

from patma import PNoKwargs, PFixedPosArg

class MyClass:
      ... # implementations

     @PFixedPosArg(1)
     @PNoKwargs
     def __match__(cls, inst):
         return inst

which is the same as

class MyClass:
    ... # implementations

   def __match__(cls, inst, argc, kwargs):
      assert not kwargs, f"pattern {cls} expects no keyword arguments, got {kwargs}."
      assert argc == 1, f"pattern {cls} expects 1 positional arguments, got {argc}."
      return (inst, )

Jun 25 '20 23:06 thautwarm

@thautwarm, please let it go. Your suggestion is off-topic, has already been rejected twice, and does not solve this issue being addressed here.

Jun 26 '20 00:06 brandtbucher

Okay, but could you please show this?:

does not solve this issue being addressed here.

Jun 26 '20 00:06 thautwarm

The current protocol is perfectly capable of accommodating this concern: we would just need to remove the nice default behavior which matches the proxy when __match_args__ is missing or None. We're trying to see if we can find a way to keep or modify the nice default behavior without breaking compatibility for classes that later implement more complicated destructuring. Your protocol suffers from the same flaw if set up with the same default behavior for object.

Please follow your own advice in the linked issue:

However I see this is not the current focus and discussions about this can be harmful in this stage. I won't talk about this any more unless higher priority issues got addressed.

Jun 26 '20 00:06 brandtbucher

Idea: drop the default behavior and instead add it back just to select standard objects like str and int.

Jun 26 '20 03:06 gvanrossum

Yeah, that could work (not a huge fan of more special cases, but I see the pragmatism). I think no matter what, we will need to drop the default behavior.

But it's generally useful enough (and hard enough to get right as a hack) that I'd like to provide a fairly frictionless way of getting it back for user-defined classes, if it isn't too hard.

Perhaps __match_args__ = ... (Ellipsis)? I can't really think of any other magic constants we could use here. It should be on __match_args__, since we probably want either having this behavior or positional matches to be mutually exclusive.

Jun 26 '20 04:06 brandtbucher

I'm not sure I understand the issue, could one of you give a brief explanation? I tried the link but it didn't send me to the correct place.

Jun 26 '20 05:06 viridia

Original comment and my response.

Jun 26 '20 05:06 brandtbucher

Basically, the default match behavior for classes (which allows a single positional match against the whole proxy) means that, as soon as a class adds __match_args__ with one or more args in the future, old code will start silently breaking in strange ways.

Jun 26 '20 05:06 brandtbucher

Got it.

I suspect that adding an explicit match_args to every built-in non-reference type is probably the cleaner answer in the long run. I think the motivation for the clever default is partly laziness (in the Perl sense), it avoids having to go and update a whole bunch of value types - but the downside is that you now have a large blast radius for any unforeseen side effects of this new behavior. Probably better to use a scalpel...

Jun 26 '20 05:06 viridia

Clairfy: Despite of __match_args__, defining __match__ later will also break the backward compatibility.

This cannot get avoided because users can arbitrarily decide the first pos arg x for a pattern C(x) by giving a certain implementation of __match__, no matter what __match_args__ is used, if we can write C.__match__ (this observation suggests us that to keep BC, the default behaviors can be only made for classes having frozen "match" and "match_args").

I saw you all constantly talking about __match_args__, hence I think I should stress this fact here, as this is reason why my proposal cannot solve this issue, even if mine doesn't need any __match_args__.

Jun 26 '20 06:06 thautwarm

Idea: drop the default behavior and instead add it back just to select standard objects like str and int.

I agree, too, that this is probably the most reasonable approach. Pablo's example seems to highlight that this is mainly an issue with "composite" types. As long as we stick to value types for the default behaviour, I think we should be fine. And, luckily, there are not too many of those.

Jun 26 '20 06:06 Tobias-Kohn

Some analyses:

why we want a default behavior?

... aims at providing a basic, useful (but still safe) experience with pattern matching out of the box
actual problem concerning use cases?

While some users use the out-of-box support for types from existing libraries, others might change those types to have more complex matching behaviors. This causes a conflict and breaks the former's code.

(feel free to add more concerns here)
what can we do?
- keeping default behaviors for all objects is impossible in general, if the execution of pattern matching will aware the modifications of either __match_args__ or __match__.
- keeping default behaviors for objects that have immutable and readonly "match_args" and "match" can be safe. Even though by changing other attributes of such an object, the compatibility may also get lost, but I don't see this usually happens in a regular use cases. We don't consider the problems of monkey patches.
- default behavior support for larger proportion of existing code is still available, in other forms without any/radical change to the PEP.
  
  Consider why this happens in practice: some users change a class/object from an upstream library, which is also used by other users depending on the default behavior.
  
  This can be prevent by providing them handy means to create wrapper patterns, like
```
nparray = register(np.array, __match__ = __match__)
```
  As a consequence, a user in need of a complex matching for np.array, can use nparray instead of modifying np.array.__match__.

Jun 26 '20 06:06 thautwarm

For most classes the default behavior is not very useful. It works for str(x) and int(x) because those have in fact behavior where str(x) == x etc. But for user classes that behavior is not very common.

Jun 26 '20 16:06 gvanrossum

I agree that it's not correct for most classes, but still I think it makes sense for many. I'll link issue #58 since there was a short discussion on this. The trivial use case of foo(x) can of course be replaced by x := foo(), but I'm considering cases where you want to do something more complex, like numbers.Real(42) or set(.items) in the linked issue.

It's not a huge deal, but it does feel odd to restrict this magic to select built-in types. If we don't like __match_args__ = ..., what about allowing None or "." or something to appear in __match_args__, signifying that that positional argument matches the proxy? There's definitely some weird things you could do with that, but it makes getting this back really simple.

Jun 26 '20 17:06 brandtbucher

I'm willing to let this go if others feel differently, though.

Jun 26 '20 17:06 brandtbucher

I really like to idea of writing stuff like int(42) to specify that I want to have an int with the given value. This becomes rather cumbersome to express with x := int() syntax. For custom classes, however, there are no literals to do this in the first place, so x := foo() is probably fine. Moreover, I think that the actual data in custom classes is (almost) always organised in attributes, so that it makes sense to prioritise an attribute-oriented approach.

The thing that really sold me on #58 was that it just works out of the box for built-in types (where appropriate). But I can happily live without having similar behaviour (by default) for custom classes.

Jun 26 '20 19:06 Tobias-Kohn

Okay, let's kill the default behavior and build a list of classes that will support it instead. I'm tempted to just say "all types in builtins" as a rule, but maybe we want to be more nuanced? Here are all of the non-exception types:

[x] bool
[x] bytearray
[x] bytes
[ ] classmethod
[ ] complex
[x] dict
[ ] enumerate
[ ] filter
[x] float
[x] frozenset
[x] int
[x] list
[ ] map
[ ] memoryview
[ ] object
[ ] property
[ ] range
[ ] reversed
[x] set
[ ] slice
[ ] staticmethod
[x] str
[ ] super
[x] tuple
[ ] type
[ ] zip

Jun 26 '20 22:06 brandtbucher

That looks right.

Jun 26 '20 23:06 gvanrossum

I imagine we raise ImpossibleMatchError if a subclass of these also defines __match_args__?

Jun 26 '20 23:06 brandtbucher

Why?

Jun 26 '20 23:06 gvanrossum

Never mind, I see now that just using the subclass's __match_args__ is correct here.

Jun 26 '20 23:06 brandtbucher

By the way, do we still want to accept __match_args__ = None? I don't think it's any different from __match_args__ = () anymore...

We could even use it to enable the old default behavior?

Jun 26 '20 23:06 brandtbucher

You’re right, it would remove a bunch of tedious language from the PEP.

I think this is ready for an update to the PEP then?

Jun 26 '20 23:06 gvanrossum

Sorry, I updated my comment with the last line. Don't know if you saw.

Jun 26 '20 23:06 brandtbucher

Let’s not introduce a shortcut for the old default.

Jun 26 '20 23:06 gvanrossum

Okay, I'll update the PEP and implementation.

Jun 26 '20 23:06 brandtbucher

Possibly type and complex. None of the others though. (Thanks for producing that list!)

Jun 27 '20 01:06 viridia

Not type, type(t) != t. And complex has two attributes, real and imag, so that would be very confusing — does complex(0) match 0j or any complex with real==0 ?

Jun 27 '20 03:06 gvanrossum

Sorry, just to clarify: I should remove type and complex from this list (and we'll do complex.__match_args__ = ("real", "imag"))?

Funny, complex would actually be a decent use case for __match_args_required__ = 2. Oh well.

Jun 27 '20 05:06 brandtbucher

I think complex should not be given positional args at all, to avoid confusion. Same for type.

Jun 27 '20 15:06 gvanrossum

patma patma copied to clipboard

Adding __match_args__ is backwards incompatible

patma
patma copied to clipboard