
[Python] Reconstruct MapArray from Arrays without loss of nulls

Open ianmcook opened this issue 9 months ago • 4 comments

Describe the bug, including details regarding any error messages, version, and platform.

I have a MapArray created like this:

import pyarrow as pa

table = pa.table(
    {"m": [{"a": 1}, {}, None, {"b": None}]},
    schema=pa.schema([pa.field("m", pa.map_(pa.string(), pa.int64()), True)])
)
m = table["m"].chunks[0]

Notice that the value at position 2 is null:

m.to_pandas()
## 0     [(a, 1.0)]
## 1             []
## 2           None
## 3    [(b, None)]
## dtype: object

m.is_null()
## <pyarrow.lib.BooleanArray object at 0x1637cb220>
## [
##   false,
##   false,
##   true,
##   false
## ]

I am trying to reconstruct the MapArray from its constituent arrays, but the result always loses the null:

new = pa.MapArray.from_arrays(m.offsets, m.keys, m.items, m.type)

new.to_pandas()
## 0     [(a, 1.0)]
## 1             []
## 2             []
## 3    [(b, None)]
## dtype: object

new.is_null()
## <pyarrow.lib.BooleanArray object at 0x1637cb220>
## [
##   false,
##   false,
##   false,
##   false
## ]

Wrapping pa.array(..., mask=m.is_null()) around it does not help either.

Is there any way to reconstruct the MapArray and keep the null?

Component(s)

Python

ianmcook • May 16 '24 04:05