`np.recarray` is not properly converted
Affects: PythonCall
Describe the bug
It seems that pyconvert(Any, ::<np.recarray>) incorrectly assumes that a recarray can be wrapped as a PyArray.
julia> np = pyimport("numpy");
julia> arr = np.recarray((2,2), dtype = @py([("A", "O"), ("B", "O")]))
Python:
rec.array([[(None, None), (None, None)],
           [(None, None), (None, None)]],
          dtype=[('A', 'O'), ('B', 'O')])
julia> pyconvert(Any, arr)
2×2 PyArray{NamedTuple{(:A, :B), Tuple{PythonCall.Wrap.UnsafePyObject, PythonCall.Wrap.UnsafePyObject}}, 2}:
(A = UnsafePyObject(Ptr{PyObject} @0x00007ffc75ba4830), B = UnsafePyObject(Ptr{PyObject} @0x00007ffc75ba4830)) (A = UnsafePyObject(Ptr{PyObject} @0x00007ffc75ba4830), B = UnsafePyObject(Ptr{PyObject} @0x00007ffc75ba4830))
(A = UnsafePyObject(Ptr{PyObject} @0x00007ffc75ba4830), B = UnsafePyObject(Ptr{PyObject} @0x00007ffc75ba4830)) (A = UnsafePyObject(Ptr{PyObject} @0x00007ffc75ba4830), B = UnsafePyObject(Ptr{PyObject} @0x00007ffc75ba4830))
Fields cannot be accessed by name after conversion.
julia> arr.A
Python:
array([[None, None],
       [None, None]], dtype=object)
julia> pyconvert(Any, arr).A
ERROR: type PyArray has no field A
Stacktrace:
[1] getproperty(x::PyArray{NamedTuple{(:A, :B), Tuple{PythonCall.Wrap.UnsafePyObject, PythonCall.Wrap.UnsafePyObject}}, 2, true, false, NamedTuple{(:A, :B), Tuple{PythonCall.Wrap.UnsafePyObject, PythonCall.Wrap.UnsafePyObject}}}, f::Symbol)
@ Base .\Base.jl:37
[2] top-level scope
@ REPL[11]:1
Indexing first and then accessing the field works, but the result is not a usable wrapper.
julia> pyconvert(Any, arr)[1]
(A = PythonCall.Wrap.UnsafePyObject(Ptr{PythonCall.C.PyObject} @0x00007ffc75ba4830), B = PythonCall.Wrap.UnsafePyObject(Ptr{PythonCall.C.PyObject} @0x00007ffc75ba4830))
julia> pyconvert(Any, arr)[1].A
PythonCall.Wrap.UnsafePyObject(Ptr{PythonCall.C.PyObject} @0x00007ffc75ba4830)
I think `pyconvert(Any, ::<np.recarray>)` should return something equivalent to a StructArray, so that `foo[1].A` and `foo.A[1]` are equivalent.
Environment: Julia v1.9.3, PythonCall v0.9.23
Part of the issue seems related to https://github.com/JuliaPy/PythonCall.jl/blob/main/src/Convert/pyconvert.jl#L222, where Python objects that implement the array interface get special treatment. In this case, however, the object has more structure than just array structure, and that extra structure gets lost.
The conversion to PyArray doesn't actually lose any structure: the numpy array really is essentially just an array of named tuples. The difference is that numpy gives you a way to access the subarray corresponding to a single component of these named tuples and PyArray doesn't. No reason we couldn't support a similar interface.
A bigger issue is the presence of UnsafePyObject in the wrapped array. Ideally those would be Py instead.
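The relationship described above can be checked on the NumPy side. This is a minimal sketch that assumes only NumPy itself; the variable names and the filled-in values are illustrative and not taken from the report. It shows that a recarray is a view over the same structured data and that the extra thing it offers is attribute-style access to a field's subarray.

```python
import numpy as np

# Structured array with the same dtype as in the report.
plain = np.empty((2, 2), dtype=[("A", "O"), ("B", "O")])
plain["A"] = "a"   # illustrative values, not from the report
plain["B"] = "b"

# Viewing it as a recarray copies nothing; it only changes the Python type.
rec = plain.view(np.recarray)

# Both spellings reach the subarray of field A.
col_by_key = plain["A"]   # works on any structured array
col_by_attr = rec.A       # attribute access is the recarray convenience

# Field-then-index and index-then-field agree, which is the
# StructArray-like behavior requested in the report.
assert rec.A[0, 0] == rec[0, 0].A == "a"

# Writes through the recarray are visible in the original array.
rec.A[0, 0] = "changed"
assert plain["A"][0, 0] == "changed"
```

A PyArray-side equivalent would presumably need only the first half of this: some way to obtain the per-field subarray from the wrapped array of named tuples.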
Yes, maybe the structure itself is not lost, but it becomes ambiguous in a Python -> Julia -> Python round trip, since both recarrays and plain arrays of named tuples would end up as the same thing. In my use case at least, there is a need to distinguish these two.
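The ambiguity can be made concrete in NumPy terms. This sketch assumes only NumPy, and its names are illustrative. A recarray and a plain structured array can share shape and field names, so a conversion that keeps only element type and shape cannot tell them apart; the Python type is the only distinguishing mark.

```python
import numpy as np

fields = [("A", "O"), ("B", "O")]
structured = np.empty((2, 2), dtype=fields)      # plain array of records
record_arr = np.recarray((2, 2), dtype=fields)   # as in the report

# Same shape and same field names...
same_layout = (
    structured.shape == record_arr.shape
    and structured.dtype.names == record_arr.dtype.names
)

# ...but different Python types, and only one offers attribute access.
kinds = (type(structured).__name__, type(record_arr).__name__)
has_attr = (hasattr(structured, "A"), hasattr(record_arr, "A"))

print(same_layout, kinds, has_attr)
```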