PyCall.jl

Numpy arrays of `str` dtype are not converted to Arrays

Open sethaxen opened this issue 3 years ago • 2 comments

An `np.array` with `dtype=str` remains a `PyObject` when passed to Julia, instead of being converted to a useful `Array{String}`. Here's a minimal example:

```julia
julia> using PyCall

julia> py"""
       import numpy as np
       """

julia> py"""np.array(["a"], dtype=str)"""
PyObject array(['a'], dtype='<U1')
```

sethaxen, May 13 '21 21:05

This wouldn't be too hard to implement but requires an additional dependency:

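NumPy string arrays are stored as contiguous fixed-length character arrays, whereas a `Vector{String}` in Julia is a vector of pointers to separately allocated strings, so the data cannot simply be wrapped without copying.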

We'd need something like ShortStrings.jl
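
To make the layout difference concrete, here is a minimal sketch in plain Python (standard library only, no NumPy) of how a `<U3`-style buffer is organised and how it could be decoded element by element. It assumes NumPy's `U` dtype layout of one 4-byte UTF-32 code point per character, little-endian for `<`, with each element NUL-padded to the fixed width. The helper names `pack_fixed_width` and `unpack_fixed_width` are hypothetical and are not part of NumPy or PyCall.

```python
# Illustrative only: mimics the fixed-width layout of a NumPy '<U3' array
# with the standard library. Helper names are hypothetical.

CODE_UNIT_BYTES = 4  # assumption: 'U' dtype stores one UTF-32 code point per character


def pack_fixed_width(strings, width):
    """Pack strings into one contiguous little-endian UTF-32 buffer,
    padding every element with NUL characters up to `width` characters."""
    buf = bytearray()
    for s in strings:
        if len(s) > width:
            raise ValueError(f"{s!r} is longer than {width} characters")
        buf += s.ljust(width, "\0").encode("utf-32-le")
    return bytes(buf)


def unpack_fixed_width(buf, width):
    """Decode a contiguous fixed-width buffer back into a list of
    independently allocated strings, dropping the NUL padding."""
    itemsize = width * CODE_UNIT_BYTES
    if len(buf) % itemsize != 0:
        raise ValueError("buffer length is not a multiple of the item size")
    out = []
    for offset in range(0, len(buf), itemsize):
        chunk = buf[offset:offset + itemsize]
        out.append(chunk.decode("utf-32-le").rstrip("\0"))
    return out


if __name__ == "__main__":
    raw = pack_fixed_width(["a", "abc", ""], width=3)
    print(len(raw))                      # 3 elements * 3 chars * 4 bytes
    print(unpack_fixed_width(raw, 3))
```

A Julia-side conversion would presumably need the same per-element decode step, either copying each element into its own `String` or keeping the data in place with a fixed-size string type. Note that trailing NUL characters cannot be told apart from padding in this layout.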

PhilipVinc, May 24 '21 13:05

Wouldn't it be preferable to interpret the Numpy string array as an Array{Vector{UInt8}} and then access it using the bytes interface?

I could really use this functionality. What would be the best way to implement a patch?

barrettp, May 14 '22 23:05