pyo3 icon indicating copy to clipboard operation
pyo3 copied to clipboard

Support for PEP 622 structural pattern matching

Open fschutt opened this issue 4 years ago • 16 comments

Just wanted to ask: New versions of Python have added support for pattern matching (PEP 622):

def make_point_3d(pt):
    match pt:
        case (x, y):
            return Point3d(x, y, 0)
        case (x, y, z):
            return Point3d(x, y, z)
        case Point2d(x, y):
            return Point3d(x, y, 0)
        case Point3d(_, _, _):
            return pt
        case _:
            raise TypeError("not a point we support")

Would it be possible to extend Pyo3 to support #[derive(FromPyObject)] for Rust enums, so that Rust can work natively with these Python enums? According to the specification, all that needs to be done is that the Python interpreter tries to look for new functions

Right now I am emulating this behaviour by having a struct with #[repr(transparent)] and a function to create the case:

#[repr(C, u8)]
enum Point {
    CaseA(PointCaseA),
    CaseB(PointCaseB),
}

#[pyclass]
#[repr(C)]
struct PointCaseA {
    x: f32,
    y: f32,
}

#[pyclass]
#[repr(C)]
struct PointCaseB {
    x: f32,
    y: f32,
    z: f32,
}

#[pyclass]
#[pyo3(name = "Point")]
#[repr(transparent)]
struct PointWrapper {
    _0: Point,
}

#[pymethods]
impl PointWrapper {
    // not the exact code, just pseudocode
    pub fn destructure(&self) -> (&'static str, PyObject) {
        let casted: &Point = unsafe { mem::transmute(&self) };
        match casted {
            CaseA(x, y) => ("CaseA", PyObject::Tuple(&[x, y]),
            CaseB(x, y, z) => ("CaseB", PyObject::Tuple(&[x, y, z]),
        }
    }
}
def make_point_3d(pt): # pt = object created by Rust
    tag, payload = pt.destructure()
    if tag == "CaseA":
        return Point3d(payload.x, payload.y, 0)
    elif tag == "CaseB":
        return Point3d(payload.x, payload.y, payload.z)
    else:
        raise TypeError("not a point we support")

It's not a very high priority, but it would be nice to have. Just wanted to ask if the PyO3 team is aware of this PEP.

fschutt avatar Jul 04 '21 09:07 fschutt

This is definitely interesting, but:

  • I'd let Python's match conventions shake out themselves for at least a release
  • better support for enums should be not only 3.10+

birkenfeld avatar Jul 04 '21 13:07 birkenfeld

TBH I was assuming that #[pyenum] #417 (and maybe even #[pyclass]) would eventually support match natively on Python 3.10. I wasn't rushing to try this until closer to 3.10's release.

(Though they should continue to work perfectly fine on earlier versions too without match. It should just be a nice added convenience.)

davidhewitt avatar Jul 04 '21 15:07 davidhewitt

Hi, I wanted to know if I could volunteer to work on this issue.

errantsky avatar Nov 27 '21 13:11 errantsky

Sure, but I am still working on fieldless enums (#2013). I think you can work on pattern matching for tuple struct before #2013 is done to avoid conflicted implementation. (It's almost done, so if you really want to work on pattern matching on enums, you can fork my branch. Just let me know when you fork it so I won't rebase that branch and screw your worrks.)

Pattern matching on enums is discussed at #417. @davidhewitt maybe you want to close this as a duplication of #417?

jovenlin0527 avatar Nov 28 '21 12:11 jovenlin0527

Thanks for the info, @b05902132. I want to work on pattern matching for tuple structs, so I have forked your branch. I would appreciate it if you gave any additional information on how I should go about implementing it.

errantsky avatar Nov 29 '21 03:11 errantsky

I'm also new to this project, and not very familiar with the codebase, but I'll try:

The details of Python pattern matching is described in PEP634. I think you may want to implement this as a class pattern. If so, you need to add a class attribute named __match_args__ using descriptors, which can be implemented by using gen_py_const, somewhat similar to this. There can only be up to one implementation for ::pyo3::class::impl_::PyClassDescriptors, so you may have to modify the existing implementation for structs. You will also need to add some getters, and implement them as descriptors, too. However, I haven't thought about how to implement that (sorry!)

Also if you want to implement pattern matching on enums, I think the best way to do is to implement a subclass to the enum for each variant. (see the discussion in #417). Then you will have to add __match_args__ for each subclass.

By the way, If you haven't worked with proc macro before, https://github.com/dtolnay/proc-macro-workshop is extremely helpful for getting yourself familiar around that.

If you have more questions, please ask on GItter, so the more experienced contributors can answer them too!

jovenlin0527 avatar Nov 29 '21 05:11 jovenlin0527

@errantsky My branch depends on #2014, and as a part of merging that PR, I'm afraid I will have to rebase my branch. I hope you didn't start any work yet 😅

jovenlin0527 avatar Nov 29 '21 08:11 jovenlin0527

@b05902132 Thanks so much for the info. I have not started any work yet, so go ahead and rebase your branch.

errantsky avatar Nov 29 '21 21:11 errantsky

Sorry it's taken me a long time to get around to replying here. The above comments all sound great.

I've been thinking a little bit on the design, I think the right solution is for PyO3 to automatically define __match_args__ for a #[pyclass] when the user defines #[new]? I'm thinking imagine this class:

#[pyclass]
struct Point2d {
    x: f64,
    y: y64,
}

#[pymethods]
impl Point2d {
    #[new]
    fn new(x: f64, y: f64) -> Self {
        Self { x, y }
    }
}

then the generated __match_args__ should be ("x", "y")? (I'm pretty much copying https://docs.python.org/3/whatsnew/3.10.html#patterns-and-classes.)

davidhewitt avatar Dec 09 '21 20:12 davidhewitt

I don't think we need __match_args__ for regular structs. We can simply match against attribute names. That's how Python matches dataclass in PEP636:

from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

def where_is(point):
    match point:
        case Point(x=0, y=0):
            print("Origin")
        case Point(x=0, y=y):
            print(f"Y={y}")
        case Point(x=x, y=0):
            print(f"X={x}")
        case Point():
            print("Somewhere else")
        case _:
            print("Not a point")

This syntax is also similar to Rust's:

struct Point {x : i32, y: i32}

match point {
    Point{x=0, y} => ... // Not Point2d(0, y)!
    Point{x, y} => ...
}

It's very easy to support this form of Pattern matching. When Python sees case Point(x = 0, y): f(y), it's translated to:

if isinstance(point, Point) and point.x == 0:
    y = point.y
    f(y)

So the following code already supports pattern matching (by attribute names):

#[pyclass]
struct Point{
    #[pyo3(get)]
    x: i32,
    #[pyo3(get)]
    y: i32,
}

Also, I think there are cases where it's not suitable to generate __match_args__. For example the argument in __new__ may be stored in a non-pub field, or have been consumed.

jovenlin0527 avatar Dec 10 '21 13:12 jovenlin0527

We do, however, need to support tuple structs. In Rust we match them by:

struct Point(i32, i32);
match point {
    Point(x, y) => ...
}

Therefore I think we should support the following:

match point:
    case Point(x, y):
        ...

I think this can be implemented by adding new descriptors for all tuple structs.

jovenlin0527 avatar Dec 10 '21 13:12 jovenlin0527

Right, I see, although class patterns can benefit from __match_args__ to allow positional match patterns as well as keyword ones.

I think this can be implemented by adding new descriptors for all tuple structs.

I wouldn't add this automatically; it's a breaking change in behaviour. Users of tuple structs may have non-pub fields in just the same way.

Instead I now think that the "default" match behaviour should be left for the PyO3 user to support. We could autogenerate __match_args__ and descriptors for #[pyclass(dataclass)] proposed in #1375

davidhewitt avatar Dec 14 '21 07:12 davidhewitt

I think the right solution is for PyO3 to automatically define __match_args__

I feel that this asserts that if you create two instances of a class using the same arguments, that these instances are then interchangeable. Which isn't necessarily the case.

I think it would be best to just leave this up to the user.

mejrs avatar Dec 14 '21 15:12 mejrs

Agreed. It's possibly the case that for enums and #[pyclass(dataclass)] we will want to do something special, but for general pyclasses I think the solution for now is to do nothing.

davidhewitt avatar Dec 14 '21 21:12 davidhewitt

As per https://github.com/PyO3/pyo3/pull/2065#issuecomment-1016205803 we probably need to add #[pyclass(mapping)] and #[pyclass(sequence)] to enable pyclasses to be matched as mappings or sequences.

davidhewitt avatar Feb 05 '22 16:02 davidhewitt

I took a look at adding such flags, but it seems that doing this via the C-API is only supported in non-abi3 cases. It's probably better to implement support for inheriting from collections.abc.Mapping and collections.abc.Sequence instead.

davidhewitt avatar Apr 02 '22 15:04 davidhewitt

From a quick test, looks to me like the new PyMapping::register and PySequence::register APIs enable support to match a #[pyclass] as a mapping or a sequence. I don't think there's any additional work that we need to do here.

davidhewitt avatar Aug 24 '22 05:08 davidhewitt