Awkward allows non-nullable unknown type, but Arrow doesn't
While trying to construct a minimal reproducer, I found that Arrow directly forbids a null-type field from being non-nullable:
>>> import pyarrow as pa
>>> pa.field("name", pa.null(), nullable=False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyarrow/types.pxi", line 2266, in pyarrow.lib.field
ValueError: A null type field may not be non-nullable
In our code, we must be reaching it some other way that bypasses this check, but Arrow's intention is clear. We do allow an UnknownType to not be inside an OptionType, so the problem is that we have a broader type system, and a direct conversion produces something that Arrow doesn't consider legal.
So the right way to fix this is to wrap our EmptyArray inside UnmaskedArray on conversion to Arrow, but include enough metadata in the ExtensionType that when we convert it back, we know to remove the option-type, so that the round trip is preserved. I'll make this an issue.
Originally posted by @jpivarski in https://github.com/scikit-hep/awkward/issues/2337#issuecomment-1482892635
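The proposed fix can be illustrated with a toy model of the two type systems. This is only a sketch of the round-trip idea, not Awkward's or Arrow's actual code: the classes `UnknownType`, `OptionType`, and `ArrowField`, the functions `to_arrow` and `from_arrow`, and the metadata key `option_added` are all hypothetical names invented for the example.

```python
from dataclasses import dataclass


# Hypothetical stand-ins; none of these are Awkward's or Arrow's real classes.
@dataclass(frozen=True)
class UnknownType:
    """Toy version of the unknown type that an EmptyArray has."""


@dataclass(frozen=True)
class OptionType:
    """Toy version of an option-type wrapping another type."""
    content: object


@dataclass(frozen=True)
class ArrowField:
    """Toy Arrow field: type name, nullable flag, extension metadata."""
    type_name: str
    nullable: bool
    metadata: dict

    def __post_init__(self):
        # Mirrors the rule that pa.field(...) enforces in the traceback above.
        if self.type_name == "null" and not self.nullable:
            raise ValueError("A null type field may not be non-nullable")


def to_arrow(awkward_type):
    """Convert a toy Awkward type to a toy Arrow field."""
    if isinstance(awkward_type, UnknownType):
        # Bare unknown: add an option (nullable=True) and record that we did.
        return ArrowField("null", True, {"option_added": True})
    if isinstance(awkward_type, OptionType) and isinstance(
        awkward_type.content, UnknownType
    ):
        # Already option-type: nothing was added, so nothing to undo later.
        return ArrowField("null", True, {"option_added": False})
    raise NotImplementedError("this toy model only handles unknown types")


def from_arrow(field):
    """Convert back, removing the option only if to_arrow added it."""
    if field.type_name != "null":
        raise NotImplementedError("this toy model only handles null fields")
    if field.metadata.get("option_added", False):
        return UnknownType()
    return OptionType(UnknownType())
```

With the flag stored in the metadata, both `UnknownType()` and `OptionType(UnknownType())` map to a legal nullable null field and still come back as the type they started as.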