iceberg-python

Add support for pyarrow DurationType

Open 0x26res opened this issue 8 months ago • 4 comments

Feature Request / Improvement

Currently a pa.Schema with a pa.DurationType can't be converted to an iceberg schema.

I think it should be treated the same way as a pa.Time64Type and be mapped to a time type in iceberg.

import pyarrow as pa
import pytest
from pyiceberg.catalog import Catalog
from pyiceberg.io.pyarrow import UnsupportedPyArrowTypeException


def test_iceberg_config():
    pa_schema = pa.schema(
        [
            pa.field("timestamp", pa.timestamp("us", "UTC")),
            pa.field("time", pa.time64("us")),
            pa.field("duration", pa.duration("us")),
        ],
    )
    with pytest.raises(
        UnsupportedPyArrowTypeException,
        match=r"Column 'duration' has an unsupported type: duration\[us\]",
    ):
        Catalog._convert_schema_if_needed(pa_schema)

0x26res avatar Apr 09 '25 10:04 0x26res

@0x26res Thanks for raising this issue. From what I understand, a duration is different from a time. Could you elaborate how this would map onto time?

Fokko avatar Apr 15 '25 19:04 Fokko

I guess in Python a datetime.timedelta (aka duration) is like a datetime.time, except that a timedelta value can be negative and can be greater than a day.
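A minimal standard-library sketch of that difference follows. The helper name `timedelta_to_time` is hypothetical, made up for illustration, and is not part of pyiceberg or pyarrow. It accepts a timedelta only when it falls inside a single day, which is the range a datetime.time can represent.

```python
from datetime import time, timedelta


def timedelta_to_time(delta: timedelta) -> time:
    """Interpret a duration as a time of day (hypothetical helper).

    Raises ValueError for durations outside [0, 24h), because
    datetime.time cannot represent them.
    """
    if not timedelta(0) <= delta < timedelta(days=1):
        raise ValueError(f"{delta!r} is not a valid time of day")
    # Floor-dividing two timedeltas yields an exact integer count.
    total_us = delta // timedelta(microseconds=1)
    hours, rem = divmod(total_us, 3_600_000_000)
    minutes, rem = divmod(rem, 60_000_000)
    seconds, micros = divmod(rem, 1_000_000)
    return time(hours, minutes, seconds, micros)
```

Under this sketch, `timedelta(hours=1, minutes=30)` maps to `time(1, 30)`, while `timedelta(hours=25)` and `timedelta(seconds=-1)` are rejected.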

In pyarrow, there isn't this constraint. You can create a time64 that represents more than 24 hours or that is negative. In that respect, duration and time64 in pyarrow are both an int64 which, together with its unit ("us", "ns", ...), is interpreted as a logical type.

The Iceberg spec for the time type is a bit loose:

Time of day, microsecond precision, without date, timezone

I guess we can either:

  • have the library convert pa.duration to an iceberg time by default
  • force the user to convert their pa.duration('us') to pa.time64('us') beforehand, if they're happy to interpret their duration as time.
  • add support for an explicit duration type in iceberg.

0x26res avatar Apr 15 '25 21:04 0x26res

This was just formally proposed to the dev mailing list via https://docs.google.com/document/d/12ghQxWxyAhSQeZyy0IWiwJ02gTqFOgfYm8x851HZFLk/edit?tab=t.0#heading=h.rt0cvesdzsj7

I think it is wise to wait for this to be officially implemented before attempting to stick it into the time type.

jayceslesar avatar Apr 21 '25 17:04 jayceslesar

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

github-actions[bot] avatar Nov 13 '25 00:11 github-actions[bot]

This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

github-actions[bot] avatar Nov 27 '25 00:11 github-actions[bot]