Awkward allows non-nullable unknown type, but Arrow doesn't

Open jpivarski opened this issue 1 year ago • 0 comments

In trying to construct a minimal reproducer, I find that pyarrow directly forbids a field of null type from being non-nullable:

>>> import pyarrow as pa
>>> pa.field("name", pa.null(), nullable=False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyarrow/types.pxi", line 2266, in pyarrow.lib.field
ValueError: A null type field may not be non-nullable

In our code, we must be reaching it through some other path that bypasses this check, but Arrow's intention is clear. We do allow an UnknownType that is not inside an OptionType, so the problem is that our type system is broader and a direct conversion produces something that Arrow doesn't consider legal.

So the right way to fix this is to wrap our EmptyArray inside an UnmaskedArray on conversion to Arrow, but include enough metadata in the ExtensionType that when we convert it back, we know to remove the option-type, so that the original type is preserved through a round trip. I'll make this an issue.
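
The idea can be sketched with a toy model, which may help when reviewing a real fix. Everything below is hypothetical and for illustration only: `UnknownType`, `OptionType`, and `ArrowField` are small stand-in classes (not the Awkward or pyarrow ones), and the `added_option` metadata key is an invented name. The sketch only shows the round-trip logic: a bare unknown type is exported as nullable with a flag recording that the option was added artificially, and the flag is used to strip the option again on import.

```python
from dataclasses import dataclass, field


# Toy stand-ins for the two type systems (NOT the real Awkward/pyarrow classes).
@dataclass(frozen=True)
class UnknownType:
    """Awkward-side type of an EmptyArray."""


@dataclass(frozen=True)
class OptionType:
    """Awkward-side option-type wrapping some content type."""
    content: object


@dataclass
class ArrowField:
    """Arrow-side field: a type name, a nullable flag, and free-form metadata."""
    type_name: str
    nullable: bool
    metadata: dict = field(default_factory=dict)

    def __post_init__(self):
        # Mirrors the rule that pyarrow enforces in the traceback above.
        if self.type_name == "null" and not self.nullable:
            raise ValueError("A null type field may not be non-nullable")


def to_arrow(awkward_type):
    """Export a type, wrapping a bare unknown type in an option."""
    if isinstance(awkward_type, UnknownType):
        # Hypothetical metadata key: records that the option is artificial.
        return ArrowField("null", nullable=True, metadata={"added_option": True})
    if isinstance(awkward_type, OptionType) and isinstance(
        awkward_type.content, UnknownType
    ):
        # The option was already present, so nothing needs to be undone later.
        return ArrowField("null", nullable=True, metadata={"added_option": False})
    raise NotImplementedError("this sketch only handles unknown types")


def from_arrow(arrow_field):
    """Import a field, removing an option that was only added for export."""
    if arrow_field.type_name != "null":
        raise NotImplementedError("this sketch only handles null-type fields")
    if arrow_field.metadata.get("added_option", False):
        return UnknownType()
    return OptionType(UnknownType())


# Both the bare and the option-wrapped unknown type survive a round trip.
for original in (UnknownType(), OptionType(UnknownType())):
    assert from_arrow(to_arrow(original)) == original
```

Without the flag, both inputs would come back as `OptionType(UnknownType())`, which is the information loss that the extra metadata is meant to avoid.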

Originally posted by @jpivarski in https://github.com/scikit-hep/awkward/issues/2337#issuecomment-1482892635

jpivarski Mar 24 '23 14:03