Improve the `pyconvert` rule dispatch system
Is your feature request related to a problem? Please describe. I add a rule with a bug in the function, try a conversion, see the bug, and define a new rule with a different (hopefully less buggy) function. However, when I try the conversion again, I get the old function :cry:.
Describe the solution you'd like When tied for priority in all other respects, I'd like to get the most recently added conversion, not the least recently added conversion.
Describe alternatives you've considered Removing rules is a more robust approach, and works even if I added a new rule with an erroneously high priority, but that can be orthogonal to this feature request.
Additional context I'm happy to make a PR if this is a change we want to make. It's technically breaking, but only barely, and IIUC we are okay with more breaking releases at this time.
The rationale for the current ordering is to prevent packages from overriding rules from upstream packages - in particular the rules in PythonCall itself. For example, it is documented API behaviour that pyconvert(Any, some_list) returns a PyList.
This system is definitely not perfect, but does prevent some likely cases.
Alternatively, I could document a rule that you can't commit type piracy (you need to "own" at least one of the Julia type or Python type).
Alternatively I could disallow adding new rules for existing combinations of inputs entirely, providing a method to add multiple rules at the same time in priority order for a given combination.
Thoughts?
prevent packages from overriding rules from upstream packages
We should keep doing that, yes.
Alternatively, I could document a rule that you can't commit type piracy (you need to "own" at least one of the Julia type or Python type).
I don't think that would be enough because if package A defines pyconvert(A.MyAbstractVector, some_list) and later definitions take priority, then pyconvert(AbstractVector, some_list) will return an A.MyAbstractVector.
Here's another problem with priority determined by insertion order:
If PkgA defines
abstract type AAbstractType
PkgB defines
struct BType <: AAbstractType end
pyconvert_add_rule("builtins:int", BType, ...)
and PkgC defines
struct CType <: AAbstractType end
pyconvert_add_rule("builtins:int", CType, ...)
then the result of pyconvert(AAbstractType, py_int) is load order dependant.
Thoughts?
Maybe pyconvert_add_rule("mod:type", T, ...) should work for pyconvert(T, x), but not for pyconvert(U, x) where U <: T. For convenience, pyconvert_add_rule(...; supertypes=true) could expand to a call to pyconvert_add_rule for T and all its supertypes (equivalent to current behavior), but the requirement to own either the Julia type or the Python type would still hold for each individual call, so to set supertypes=true would require owning the python type.
Then, in the absence of piracy or of two Julia packages claiming ownership of the same Python type, precedence between packages would never be determined by insertion order, only precedence within a package. This would eliminate the load order dependence I mentioned above and also allow reversing insertion-order-priority.
I'm not sure how this would work with pyconvert(Vector{Int}, pyeval("[1,2,3]", Main)).
I don't think that would be enough because if package A defines
pyconvert(A.MyAbstractVector, some_list)and later definitions take priority, thenpyconvert(AbstractVector, some_list)will return anA.MyAbstractVector.
The alternatives I mentioned were alternative to your suggestion - the default position being earlier rules take precedence. I think that addresses this point.
then the result of
pyconvert(AAbstractType, py_int)is load order dependant.
I'm OK with the conversion being load-order dependent in that case, because there is no implied contract that pyconvert(AAbstractType, x) will return ::BType or ::CType - it only needs to return ::AAbstractType, which it does. In order for there to be a contract that it always returns ::BType say, it would need to be made by PkgA (the owner of AAbstractType), i.e. PkgA would make the rule for BType (which implies that PkgA also owns and defines BType, not PkgB).
Maybe
pyconvert_add_rule("mod:type", T, ...)should work forpyconvert(T, x), but not forpyconvert(U, x)whereU <: T.
Interesting idea.
So for example the rule pyconvert_add_rule("builtins:int", Integer, ...) would be used by pyconvert(Integer, x) or pyconvert(Any, x) but NOT by pyconvert(Int, x). This would require the user to add rules for any subtypes too - in this case probably Signed and Int and BigInt, which does add some overhead.
Also, the list of all concrete types that can be successfully converted to might not actually be known. For example, there is a generic rule for converting to any Integer type, which first converts to Int or BigInt and then converts that to the target type.
Also I'm not sure how this would even work in the face of parametric types.
For convenience,
pyconvert_add_rule(...; supertypes=true)could expand to a call topyconvert_add_ruleforTand all its supertypes.
I think you'd need an upper bound on the type, because rules are written with assumptions about the target type. You can't pass T=Any to a rule written for T<:Integer.
I flipped a subtyping expression and also conflated a few concepts in my previous proposal, I hope this partially corrected restatement makes more sense. To motivation here is still to prevent load-order dependency; while you have a valid point that the load-order dependant conversions don't have a formal guarantee, I think it would be a lot better to change that unspecified and load-order dependent behavior into consistent errors.
If I have a subtype hierarchy MyInt <: MyInteger <: NotMyNumber, then when I write pyconvert_add_rule("mod:type", MyInteger, ...), that rule could (currently) be invoked when I call pyconvert(Any, x), pyconvert(MyNumber, x), pyconvert(MyInteger, x) or pyconvert(MyInt, x).
Let's say I own MyInteger but I own niether pytype(x) nor NotMyNumber. In this case, I am okay to define that conversion rule with MyInteger and pytype(x), but should not be allowed to control what happens to pyconvert(NotMyNumber, x). To this end, I propose that the conversion rule dispatch only invoke conversion rules with an equal or more specific type (subtype) than the type passed to pyconvert.
If, on the other hand, I do own pytype(x), then I should be allowed to define the behavior of pyconvert(NotMyNumber, x). I propose adding a keyword to opt back into the current behavior and allow dispatch on all matching Julia types.
For example
julia> id = string(rand(UInt128), base=16)
"3dc54b877b9d07168e9e4814c69adec3"
julia> pyexec("""
class MyType_$id:
pass
""", Main)
julia> pyexec("""
class NotMyType_$id:
pass
""", Main)
julia> abstract type NotMyNumber end
julia> abstract type MyInteger <: NotMyNumber end
julia> struct MyInt <: MyInteger
x::Int
end
julia> my_x = pyeval("MyType_$id()", Main)
Python: <MyType_3dc54b877b9d07168e9e4814c69adec3 object at 0x117362290>
julia> not_my_x = pyeval("NotMyType_$id()", Main)
Python: <NotMyType_3dc54b877b9d07168e9e4814c69adec3 object at 0x1170546d0>
julia> t = pytype(my_x)
Python: <class 'MyType_3dc54b877b9d07168e9e4814c69adec3'>
julia> PythonCall.pyconvert_add_rule("$(t.__module__):$(t.__qualname__)", MyInteger, (_, _) -> MyInt(1), supertypes=true)
julia> t2 = pytype(not_my_x)
Python: <class 'NotMyType_3dc54b877b9d07168e9e4814c69adec3'>
julia> PythonCall.pyconvert_add_rule("$(t2.__module__):$(t2.__qualname__)", MyInteger, (_, _) -> MyInt(2))
julia> pyconvert(MyInt, my_x)
MyInt(1)
julia> pyconvert(MyInteger, my_x)
MyInt(1)
julia> pyconvert(NotMyNumber, my_x)
MyInt(1)
julia> pyconvert(Any, my_x)
MyInt(1)
julia> pyconvert(MyInt, not_my_x)
MyInt(2)
julia> pyconvert(MyInteger, not_my_x)
MyInt(2)
julia> pyconvert(NotMyNumber, not_my_x)
ERROR: cannot convert this Python 'NotMyType_3dc54b877b9d07168e9e4814c69adec3' to a Julia 'NotMyNumber'
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:35
[2] macro expansion
@ ~/.julia/packages/PythonCall/qTEA1/src/convert.jl:355 [inlined]
[3] macro expansion
@ ~/.julia/packages/PythonCall/qTEA1/src/Py.jl:131 [inlined]
[4] pyconvert(#unused#::Type{Vector}, x::Py)
@ PythonCall ~/.julia/packages/PythonCall/qTEA1/src/convert.jl:370
[5] top-level scope
@ REPL[21]:1
julia> pyconvert(Any, not_my_x)
ERROR: cannot convert this Python 'NotMyType_3dc54b877b9d07168e9e4814c69adec3' to a Julia 'Any'
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:35
[2] macro expansion
@ ~/.julia/packages/PythonCall/qTEA1/src/convert.jl:355 [inlined]
[3] macro expansion
@ ~/.julia/packages/PythonCall/qTEA1/src/Py.jl:131 [inlined]
[4] pyconvert(#unused#::Type{Vector}, x::Py)
@ PythonCall ~/.julia/packages/PythonCall/qTEA1/src/convert.jl:370
[5] top-level scope
@ REPL[21]:1
</details>
I like this. I need to think about it.
Also, we have diverged quite a way from the original issue.