[BUG]: Incorrect missing values in pylibcudf.Table.from_arrow on a sliced, string_view column

Open TomAugspurger opened this issue 7 months ago • 0 comments

Describe the bug

Calling plc.Table.from_arrow on a pyarrow Table that contains a sliced string_view column returns incorrect missing values. Something seems to be off about the validity bitmap:

Steps/Code to reproduce bug

import pyarrow as pa, pylibcudf as plc

table = pa.table({"a": pa.array(["a", None], pa.string_view())})
roundtrip = plc.interop.to_arrow(plc.interop.from_arrow(table.slice(1, 2)))
result = roundtrip.columns[0][0].as_py()
assert result is None, result

That fails with:

AssertionError: 

(Not the best message, but result is the empty string rather than None.)

Expected behavior

result should be None.
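
For context, here is a minimal pure-Python sketch of how the Arrow columnar format defines validity for a sliced array: a slice shares the parent's validity bitmap and records an offset, so logical element i is read from bit (offset + i), least-significant bit first. The helper name is_valid and the variable names are hypothetical, and the "ignores offset" branch is only an illustration of one way a reader could go wrong. It is not a confirmed diagnosis of what pylibcudf does.

```python
def is_valid(bitmap: bytes, offset: int, i: int) -> bool:
    """Return True if logical element i of a (possibly sliced) array is non-null.

    Arrow validity bitmaps use least-significant-bit numbering:
    a set bit means valid, a cleared bit means null.
    """
    pos = offset + i
    return bool((bitmap[pos // 8] >> (pos % 8)) & 1)


# Parent array ["a", None]: bit 0 is set (valid), bit 1 is cleared (null).
parent_bitmap = bytes([0b00000001])

# table.slice(1, 2) keeps the same buffers and records offset=1.
slice_offset = 1

# Reading element 0 of the slice while honoring the offset: null, as expected.
honors_offset = is_valid(parent_bitmap, slice_offset, 0)

# Hypothetical faulty read that ignores the offset: sees the parent's first bit.
ignores_offset = is_valid(parent_bitmap, 0, 0)

print(honors_offset, ignores_offset)
```

A reader that skipped the offset would treat the sliced element as valid, which would be consistent with getting a non-null value back, but the actual cause here has not been established.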

Additional context

This may be the root cause of https://github.com/rapidsai/cudf/issues/19148. Recheck that issue once this one is fixed.

TomAugspurger, Jun 13 '25 17:06