patma
patma copied to clipboard
Adding __match_args__ is backwards incompatible
Pablo pointed out that adding this is usually backwards incompatible. Maybe we need to rethink the special casing of object(c)?
https://mail.python.org/archives/list/[email protected]/message/STJSSAETMTUY7FK5AE53IM73Z2WORNYN/
Consider #100 ? Needs no __match_args__
. A default implementation made for object
will allow keyword patterns without doing any pre-settings.
For the possible reason that you rejected this...
I think @viridia's idea of returning a proxy object from match that in simple cases can just be self is a winner.
With #100, by providing a utility library:
from patma import PNoKwargs, PFixedPosArg
class MyClass:
... # implementations
@PFixedPosArg(1)
@PNoKwargs
def __match__(cls, inst):
return inst
which is the same as
class MyClass:
... # implementations
def __match__(cls, inst, argc, kwargs):
assert not kwargs, f"pattern {cls} expects no keyword arguments, got {kwargs}."
assert argc == 1, f"pattern {cls} expects 1 positional arguments, got {argc}."
return (inst, )
@thautwarm, please let it go. Your suggestion is off-topic, has already been rejected twice, and does not solve this issue being addressed here.
Okay, but could you please show this?:
does not solve this issue being addressed here.
The current protocol is perfectly capable of accommodating this concern: we would just need to remove the nice default behavior which matches the proxy when __match_args__
is missing or None
. We're trying to see if we can find a way to keep or modify the nice default behavior without breaking compatibility for classes that later implement more complicated destructuring. Your protocol suffers from the same flaw if set up with the same default behavior for object
.
Please follow your own advice in the linked issue:
However I see this is not the current focus and discussions about this can be harmful in this stage. I won't talk about this any more unless higher priority issues got addressed.
Idea: drop the default behavior and instead add it back just to select standard objects like str and int.
Yeah, that could work (not a huge fan of more special cases, but I see the pragmatism). I think no matter what, we will need to drop the default behavior.
But it's generally useful enough (and hard enough to get right as a hack) that I'd like to provide a fairly frictionless way of getting it back for user-defined classes, if it isn't too hard.
Perhaps __match_args__ = ...
(Ellipsis)? I can't really think of any other magic constants we could use here. It should be on __match_args__
, since we probably want either having this behavior or positional matches to be mutually exclusive.
I'm not sure I understand the issue, could one of you give a brief explanation? I tried the link but it didn't send me to the correct place.
Basically, the default match
behavior for classes (which allows a single positional match against the whole proxy) means that, as soon as a class adds __match_args__
with one or more args in the future, old code will start silently breaking in strange ways.
Got it.
I suspect that adding an explicit match_args to every built-in non-reference type is probably the cleaner answer in the long run. I think the motivation for the clever default is partly laziness (in the Perl sense), it avoids having to go and update a whole bunch of value types - but the downside is that you now have a large blast radius for any unforeseen side effects of this new behavior. Probably better to use a scalpel...
Clairfy:
Despite of __match_args__
, defining __match__
later will also break the backward compatibility.
This cannot get avoided because users can arbitrarily decide the first pos arg x
for a pattern C(x)
by giving a certain implementation of __match__
, no matter what __match_args__
is used, if we can write C.__match__
(this observation suggests us that to keep BC, the default behaviors can be only made for classes having frozen "match" and "match_args").
I saw you all constantly talking about __match_args__
, hence I think I should stress this fact here, as this is reason why my proposal cannot solve this issue, even if mine doesn't need any __match_args__
.
Idea: drop the default behavior and instead add it back just to select standard objects like str and int.
I agree, too, that this is probably the most reasonable approach. Pablo's example seems to highlight that this is mainly an issue with "composite" types. As long as we stick to value types for the default behaviour, I think we should be fine. And, luckily, there are not too many of those.
Some analyses:
-
why we want a default behavior?
... aims at providing a basic, useful (but still safe) experience with pattern matching out of the box
-
actual problem concerning use cases?
While some users use the out-of-box support for types from existing libraries, others might change those types to have more complex matching behaviors. This causes a conflict and breaks the former's code.
(feel free to add more concerns here)
-
what can we do?
-
keeping default behaviors for all objects is impossible in general, if the execution of pattern matching will aware the modifications of either
__match_args__
or__match__
. -
keeping default behaviors for objects that have immutable and readonly "match_args" and "match" can be safe. Even though by changing other attributes of such an object, the compatibility may also get lost, but I don't see this usually happens in a regular use cases. We don't consider the problems of monkey patches.
-
default behavior support for larger proportion of existing code is still available, in other forms without any/radical change to the PEP.
Consider why this happens in practice: some users change a class/object from an upstream library, which is also used by other users depending on the default behavior.
This can be prevent by providing them handy means to create wrapper patterns, like
nparray = register(np.array, __match__ = __match__)
As a consequence, a user in need of a complex matching for
np.array
, can usenparray
instead of modifyingnp.array.__match__
.
-
For most classes the default behavior is not very useful. It works for str(x) and int(x) because those have in fact behavior where str(x) == x etc. But for user classes that behavior is not very common.
I agree that it's not correct for most classes, but still I think it makes sense for many. I'll link issue #58 since there was a short discussion on this. The trivial use case of foo(x)
can of course be replaced by x := foo()
, but I'm considering cases where you want to do something more complex, like numbers.Real(42)
or set(.items)
in the linked issue.
It's not a huge deal, but it does feel odd to restrict this magic to select built-in types. If we don't like __match_args__ = ...
, what about allowing None
or "."
or something to appear in __match_args__
, signifying that that positional argument matches the proxy? There's definitely some weird things you could do with that, but it makes getting this back really simple.
I'm willing to let this go if others feel differently, though.
I really like to idea of writing stuff like int(42)
to specify that I want to have an int
with the given value. This becomes rather cumbersome to express with x := int()
syntax. For custom classes, however, there are no literals to do this in the first place, so x := foo()
is probably fine. Moreover, I think that the actual data in custom classes is (almost) always organised in attributes, so that it makes sense to prioritise an attribute-oriented approach.
The thing that really sold me on #58 was that it just works out of the box for built-in types (where appropriate). But I can happily live without having similar behaviour (by default) for custom classes.
Okay, let's kill the default behavior and build a list of classes that will support it instead. I'm tempted to just say "all types in builtins
" as a rule, but maybe we want to be more nuanced? Here are all of the non-exception types:
- [x]
bool
- [x]
bytearray
- [x]
bytes
- [ ]
classmethod
- [ ]
complex
- [x]
dict
- [ ]
enumerate
- [ ]
filter
- [x]
float
- [x]
frozenset
- [x]
int
- [x]
list
- [ ]
map
- [ ]
memoryview
- [ ]
object
- [ ]
property
- [ ]
range
- [ ]
reversed
- [x]
set
- [ ]
slice
- [ ]
staticmethod
- [x]
str
- [ ]
super
- [x]
tuple
- [ ]
type
- [ ]
zip
That looks right.
I imagine we raise ImpossibleMatchError
if a subclass of these also defines __match_args__
?
Why?
Never mind, I see now that just using the subclass's __match_args__
is correct here.
By the way, do we still want to accept __match_args__ = None
? I don't think it's any different from __match_args__ = ()
anymore...
We could even use it to enable the old default behavior?
You’re right, it would remove a bunch of tedious language from the PEP.
I think this is ready for an update to the PEP then?
Sorry, I updated my comment with the last line. Don't know if you saw.
Let’s not introduce a shortcut for the old default.
Okay, I'll update the PEP and implementation.
Possibly type
and complex
. None of the others though. (Thanks for producing that list!)
Not type, type(t) != t. And complex has two attributes, real and imag, so that would be very confusing — does complex(0) match 0j or any complex with real==0 ?
Sorry, just to clarify: I should remove type
and complex
from this list (and we'll do complex.__match_args__ = ("real", "imag")
)?
Funny, complex
would actually be a decent use case for __match_args_required__ = 2
. Oh well.
I think complex should not be given positional args at all, to avoid confusion. Same for type.