pyo3
Support exporting Rust enums to Python (full ADTs)
This issue replaces the closed #131.
We need to provide a way to export an enum to Python.
I haven't checked, but doing the same thing as boost python might work
well, C++ enums are a fair bit different from Rust enums.
We should split this into two issues. One is supporting old-school enums and the other is supporting ADTs. I'd suggest that, for now, this issue should focus on a pleasant way to define a Python Enum in Rust, and if any of the Rust enum variants carry state we say that is not currently supported.
Agreed with the above
I have some messy code that you can use as a reference: https://salsa.debian.org/Kazan-team/simple-soft-float/-/blob/e19291a0b8eb17e9b52c09b8e670bf0fe3244989/src/python_macros.rs
I've made this the full ADT issue and created #834 for tracking the simple case.
Cool - that works too :-)
For full ADTs I would initially limit it to the one pyclass per variant
This is now released in 0.12 if people want to try it out.
In that case, we should close this issue no?
To clarify - in 0.12 we added #[derive(FromPyObject)] which lets you convert a Python type into a Rust enum. Having a full two-way binding analogue equivalent to #[pyclass] would be even more awesome, but it's not clear what the design would be.
Well, the initial feedback seems to be that it works well enough (one less big switch statement - yay),
but if we're going to do that shouldn't we also support #[derive(IntoPy)] for enums?
I would be very open to having #[derive(IntoPy)] for enums.
I made such a comment on the FromPyObject PR, but we did agree that it feels easier to use .into_py() to handle the to-python case than it does handling the FromPyObject with an if-let chain.
Also the complication with the #[derive(IntoPy)] case is that it's not clear what Python structures everything should map to. E.g.
#[derive(IntoPy)]
enum MyEnum {
    Str(String),
    Foo { bar: Bar, qux: i32 },
}
it seems clear enough that MyEnum::Str would map into a Python str object, but what does MyEnum::Foo become? Maybe a dict with bar and qux keys?
Design opinions are very welcome from anyone with a use case for this.
I quite like the idea of using a dict for the MyEnum::Foo case. I'm not keen on having to create a new type for each enum variant (especially if there are a ton of variants).
Python introduced a new pattern matching syntax in 3.10, and I think if we can make Rust enum support that, this will be as close to ADT as we can get.
According to PEP 634, class patterns in Python are essentially a series of isinstance checks. To support that we would have to create an invisible class for each variant. Going down this way needs more design work.
- If each variant has a corresponding PyClass, then we need a way to implicitly convert that class back to the enum. We could do this by implementing FromPyObject for the enum, but FromPyObject is implemented for every PyClass, so the enum cannot be a PyClass. However if it's not a PyClass, then we can't have a Py<Enum>.
- pyenum will behave a lot like pyclass, like relationships between Rust enum and struct, and we will need to think about how to avoid code duplication.
Thanks, I see you have opened #2002 to get us moving towards this!
I think as well as isinstance there's also something to do with __match_args__? We may need to support that. I haven't played around with 3.10's pattern matching at all myself yet.
If each variant has a corresponding PyClass, then we need a way to implicitly convert that class back to the enum. We could do this by implementing FromPyObject for the enum, but FromPyObject is implemented for every PyClass, so the enum cannot be a PyClass. However if it's not a PyClass, then we can't have a Py<Enum>.
I think we definitely need to create a class for the Enum, so we can have Py<Enum>.
I'm not sure (and it may depend on Python's pattern matching logic) whether we want to have a method for each variant on the Enum class, or we want each variant to be a subclass of the original enum.
Thinking about it from a user perspective, I think we want an enum like this in Rust:
enum Point {
    TwoD { x: f32, y: f32 },
    ThreeD { x: f32, y: f32, z: f32 }
}
to use in Python as
point = Point.TwoD(x=1.0, y=3.4)
match point:
    case Point.TwoD(x, y):
        print(f"Got 2D point ({x}, {y})")
    case Point.ThreeD(x, y, z):
        print(f"Got 3D point ({x}, {y}, {z})")
an expanded example with what I'd expect:
pub enum MyEnum {
    A,
    B(),
    C {},
    D(i8, i16),
    E {a: u8, b: String},
}
should be matchable like so:
match value:
    case MyEnum.A:  # A is an instance of MyEnum, just like Python enums
        pass
    case MyEnum.B():  # B is a subclass of MyEnum
        pass
    case MyEnum.C():  # C is a subclass of MyEnum
        pass
    case MyEnum.D(x, y):  # D is a subclass of MyEnum
        pass
    case MyEnum.E(a=x, b=y):  # E is a subclass of MyEnum
        pass
If each variant has a corresponding PyClass, then we need a way to implicitly convert that class back to the enum. We could do this by implementing FromPyObject for the enum, but FromPyObject is implemented for every PyClass, so the enum cannot be a PyClass. However if it's not a PyClass, then we can't have a Py<Enum>.
I think we definitely need to create a class for the Enum, so we can have Py<Enum>.
I'm not sure (and it may depend on Python's pattern matching logic) whether we want to have a method for each variant on the Enum class, or we want each variant to be a subclass of the original enum.
When we implement each Variant as a separate class, we want them to be invisible. Specifically, when the user writes a fn in PyO3, they cannot put any Variant in their function signature. That means PyO3 will have to turn Variants into Enums for the user.
Subclassing the Enum handles this very cleanly. (Didn't realize we can do that...) If we don't subclass the Enum, we will have to write a custom FromPyObject for the Enum. This leads to unnecessary complexity, especially since there are already so many competing implementations of FromPyObject related to PyClass that we need to be very careful to avoid conflicting implementations.
I'm looking to implement this, and could use some guidance (primarily with design). Who would be the right person to speak to in order to get the design nailed down so I can implement it?
I think this issue contains all the existing design discussion, shall we proceed with it on here?
https://github.com/PyO3/pyo3/issues/417#issuecomment-972461897 is a good starting point for what I think the generated Python API should behave like. The main question in my mind is should it expand to a single Python type, or multiple Python types in an inheritance hierarchy?
I'm currently leaning towards single Python type, although if Rust had enum variants as first-class types I would be tempted to have a Python type per variant to match.
Having a single type would certainly be simpler to implement. The only drawback I see is an inability to do pattern matching.
Am I missing some way to do pattern matching on the python side (or even check if you have a specific variant) without having a type for each variant?
imho a separate class per variant may be the most useful, since python 3.10 added pattern matching: https://docs.python.org/3/reference/compound_stmts.html#the-match-statement
the variant classes should be inside the base enum class, so they can be accessed like so:
variant = MyEnum.Variant1(a=123, b='xyz')
assert type(variant) == MyEnum.Variant1
assert isinstance(variant, MyEnum)
Imagine that this impl existed on MyEnum:
#[pymethods]
impl MyEnum {
    fn set_to_a(&mut self) {
        *self = MyEnum::A;
    }
}
Ideally, this is how things would work from the python side:
enum = MyEnum.B()
assert isinstance(enum, MyEnum.B)
enum.set_to_a()
assert isinstance(enum, MyEnum.A)
Is such a thing possible?
We could disallow &mut self on enum classes, but my motivating use case requires this functionality
enum = MyEnum.B()
assert isinstance(enum, MyEnum.B)
enum.set_to_a()
assert isinstance(enum, MyEnum.A)
Is such a thing possible?
yes, isinstance and issubclass checks can be overridden in python:
https://docs.python.org/3/reference/datamodel.html#customizing-instance-and-subclass-checks
this allows you to have MyEnum.B return a MyEnum object that dynamically checks which rust variant is stored in it when you use isinstance -- by having MyEnum.B.__new__ return a new MyEnum with the variant set to B.
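A minimal pure-Python sketch of that idea, separate from any linked proof of concept. The names _VariantMeta, _make_variant and the _tag attribute are invented for illustration, and the tag stands in for whichever Rust variant is currently stored. Each MyEnum.X is a marker class whose __new__ returns a plain MyEnum, and whose metaclass answers isinstance by looking at the tag.

```python
class MyEnum:
    def __init__(self, tag):
        self._tag = tag  # stands in for the Rust variant currently stored

    def set_to_a(self):
        # Stands in for the Rust `*self = MyEnum::A`.
        self._tag = "A"


class _VariantMeta(type):
    def __instancecheck__(cls, obj):
        # Dynamic check: is obj a MyEnum currently holding this variant?
        return isinstance(obj, MyEnum) and obj._tag == cls._tag


def _make_variant(tag):
    class Variant(metaclass=_VariantMeta):
        _tag = tag

        def __new__(cls):
            # Calling MyEnum.B() hands back a MyEnum, not a Variant instance.
            return MyEnum(tag)

    Variant.__name__ = tag
    return Variant


MyEnum.A = _make_variant("A")
MyEnum.B = _make_variant("B")

enum = MyEnum.B()
assert isinstance(enum, MyEnum.B)
enum.set_to_a()
assert isinstance(enum, MyEnum.A)
```

Since class patterns go through isinstance, `case MyEnum.A():` should follow the same dynamic check, although this sketch does not handle fields.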
I've looked into using __instancecheck__, but it leads to some messes when dealing with properties. How do you imagine property lookup to work?
For instance, say that you had this python code:
variant = MyEnum.Variant1(a=123, b='xyz')
variant.some_rust_method_takes_mut_self()
print(variant.a)
Some things to consider:
- Ideally accessing a field on a variant doesn't require us to do a match on the rust enum under the covers. That would not only be inefficient, but potentially difficult to implement
- We also need a way to allow setting enum variant field values
- My plan has been to have a different python subclass for each variant, with a getter and setter for each field of the variant. This doesn't play nicely with using __instancecheck__, as the getters and setters wouldn't be correct for the underlying variant
- We could simply use __getattr__ and __setattr__ to allow all fields to be completely dynamic. The big drawbacks I see are that this won't work with python's pattern matching, and that it's hard to discover fields (such as in a python repl).
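To illustrate the fully dynamic option from the last bullet, here is a rough pure-Python sketch. DynamicEnum and its _tag/_fields storage are invented names; the dict stands in for whatever the Rust side would expose for the current variant. It also shows the discoverability problem: the fields do not appear in dir().

```python
class DynamicEnum:
    def __init__(self, tag, **fields):
        # Bypass our own __setattr__ while setting up internal state.
        object.__setattr__(self, "_tag", tag)
        object.__setattr__(self, "_fields", dict(fields))

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        fields = object.__getattribute__(self, "_fields")
        try:
            return fields[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Only fields of the current variant may be assigned.
        if name not in self._fields:
            raise AttributeError(f"variant {self._tag} has no field {name!r}")
        self._fields[name] = value


variant = DynamicEnum("Variant1", a=123, b="xyz")
variant.a = 5
print(variant.a, variant.b)
print("a" in dir(variant))  # the dynamic fields are not listed by dir()
```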
I'm leaning towards disallowing &mut self in python-facing methods. It sucks, but the alternatives seem less ideal
Instead of disallowing &mut self methods altogether, #[pymethods] could automatically generate a python-compatible function that takes a signature like this:
fn my_fn(&mut self, [args]) -> T {}
And generates a python-facing function like this:
fn _my_fn_py(mut self, [args]) -> (Self, T) {
    let result = self.my_fn([args]);
    (self, result)
}
When exporting into python.
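From the Python side, a caller of such a generated function would presumably have to rebind the object itself. A pure-Python sketch of that calling convention, with invented class and method names (A, B, take_and_reset), assuming the wrapper returns a (new_self, result) tuple:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class A:
    pass


@dataclass(frozen=True)
class B:
    count: int

    def take_and_reset(self):
        """Mimics a wrapped `&mut self` method: returns (new_self, result)."""
        return A(), self.count


value = B(count=3)
# The caller rebinds `value`; any other reference to the old object is not updated.
value, result = value.take_and_reset()
print(type(value).__name__, result)
```

The obvious cost is that every call site has to remember to rebind, and aliases of the old object keep seeing the old variant.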
proof of concept using __instancecheck__:
https://godbolt.org/z/e1494MGzE
added matching to proof-of-concept: https://godbolt.org/z/v6aP4naMn
Another option is to write directly to self.__class__.
After any &mut self function is called, this additional set of steps would need to occur:
- Check if the variant type has changed.
- If the variant type has changed:
  - Set self.__class__ to the new variant's class type
  - Remove all properties on self using delattr
  - Use setattr to copy all of the appropriate attributes from the new variant's class onto self
- You have effectively transformed self into the new variant class!
This is definitely a huge kludge, but I think is better than anything else I've come up with so far
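A pure-Python sketch of those steps, just to show the mechanics. The _sync helper, the variant class names and the n field are invented, and the new field values are passed in explicitly here rather than copied from the class. It only demonstrates that reassigning __class__ works for ordinary Python classes; whether it is possible for classes backed by a Rust struct is not shown.

```python
class MyEnum:
    def _sync(self, new_cls, **fields):
        """Run after a mutating method: retarget self if the variant changed."""
        if type(self) is new_cls:
            return  # variant unchanged, nothing to do
        self.__class__ = new_cls
        for name in list(vars(self)):
            delattr(self, name)          # drop the old variant's attributes
        for name, value in fields.items():
            setattr(self, name, value)   # install the new variant's attributes

    def set_to_a(self):
        # Stands in for the Rust `*self = MyEnum::A` followed by the sync step.
        self._sync(MyEnum.A)


class _A(MyEnum):
    pass


class _B(MyEnum):
    def __init__(self, n=0):
        self.n = n


MyEnum.A, MyEnum.B = _A, _B

enum = MyEnum.B(n=1)
alias = enum
enum.set_to_a()
assert isinstance(alias, MyEnum.A)  # same object, so aliases see the change too
```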