Support timestamp type in Iceberg migrate procedure

Open marcinsbd opened this issue 1 year ago • 13 comments

Description

Fixes https://github.com/trinodb/trino/issues/17006

Hive: It turns out that, on write, Hive interprets the timestamp in the writer's time zone and converts it to UTC. It stores the UTC value and also records the writer's time zone in the file footer. On read, it reads the UTC value and adjusts it back according to the writer's time zone.
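The write and read adjustment described above can be sketched with `java.time`. This is only an illustration of the arithmetic, not Hive's or Trino's actual code; the class name, the method names, and the sample zone `Europe/Warsaw` are assumptions made for the example.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical helper, for illustration only (not a Hive or Trino class).
final class HiveTimestampRoundTrip {
    // Write path: treat the wall-clock value as local to the writer's zone
    // and convert it to the UTC value that is stored in the file.
    static LocalDateTime toStoredUtc(LocalDateTime wallClock, ZoneId writerZone) {
        return wallClock.atZone(writerZone)
                .withZoneSameInstant(ZoneOffset.UTC)
                .toLocalDateTime();
    }

    // Read path: take the stored UTC value and shift it back into the
    // writer's zone, recovering the original wall-clock value.
    static LocalDateTime fromStoredUtc(LocalDateTime storedUtc, ZoneId writerZone) {
        return storedUtc.atZone(ZoneOffset.UTC)
                .withZoneSameInstant(writerZone)
                .toLocalDateTime();
    }

    public static void main(String[] args) {
        ZoneId writerZone = ZoneId.of("Europe/Warsaw"); // assumed sample zone, UTC+2 in May
        LocalDateTime wallClock = LocalDateTime.of(2023, 5, 8, 15, 0);

        LocalDateTime stored = toStoredUtc(wallClock, writerZone);
        System.out.println(stored);                            // 2023-05-08T13:00
        System.out.println(fromStoredUtc(stored, writerZone)); // 2023-05-08T15:00
    }
}
```

A reader that ignores the writer's zone and always assumes UTC would return the stored value (13:00 here) instead of the original 15:00.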

Trino uses the `hive.parquet.time-zone` property to set the time zone of the Parquet reader and writer when working with Hive tables.

Iceberg: Trino always sets the time zone of the Parquet reader and writer to UTC, so it always returns UTC values.

Change: In Iceberg, when reading a Parquet file whose footer contains a time zone, that time zone is used and the timestamps are adjusted accordingly.
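A minimal sketch of the zone selection this change describes: use the time zone from the footer when it is present, otherwise keep the existing UTC behavior. The class and method names are hypothetical, and the footer key `writer.time.zone` is an assumption about how Hive records the writer's zone; the PR's actual implementation may differ.

```java
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, for illustration only (not the code in this PR).
final class FooterZoneSelection {
    // Assumed name of the footer metadata key that holds the writer's zone.
    static final String WRITER_ZONE_KEY = "writer.time.zone";

    // Returns the zone recorded in the footer metadata, or UTC when the
    // footer carries no time zone entry.
    static ZoneId readerZone(Map<String, String> footerMetadata) {
        return Optional.ofNullable(footerMetadata.get(WRITER_ZONE_KEY))
                .map(ZoneId::of)
                .orElse(ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        System.out.println(readerZone(Map.of(WRITER_ZONE_KEY, "Europe/Warsaw"))); // Europe/Warsaw
        System.out.println(readerZone(Map.of()));                                 // Z
    }
}
```

Files with no time zone in the footer keep reading as UTC, so only files that record a writer zone would see adjusted values.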

https://github.com/trinodb/trino/issues/17785

Release notes

(x) Release notes are required, with the following suggested text:

# Iceberg
* Support `timestamp(3)` type in `migrate` procedure. ({issue}`17006`)

marcinsbd avatar May 08 '23 15:05 marcinsbd

@marcinsbd What's the current status of this PR?

ebyhr avatar May 16 '23 22:05 ebyhr

I'd love to see this PR merged as well (: I'm currently unable to migrate our most important table due to this bug.

rotem-ad avatar May 17 '23 16:05 rotem-ad

@marcinsbd please update the PR description with information on how timestamps are handled in Hive tables converted to Iceberg, especially whether the "parquet timezone" setting gets applied to Iceberg.

cc @dain @electrum @jirassimok for "parquet timezone" context.

findepi avatar Jan 12 '24 08:01 findepi

@findepi, @findinpath PTAL

marcinsbd avatar Jan 29 '24 10:01 marcinsbd

@marcinsbd Could you rebase on master to resolve conflicts?

ebyhr avatar Jan 31 '24 06:01 ebyhr

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

github-actions[bot] avatar Mar 20 '24 17:03 github-actions[bot]

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

github-actions[bot] avatar May 14 '24 17:05 github-actions[bot]

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

github-actions[bot] avatar Jun 12 '24 17:06 github-actions[bot]

Closing this pull request, as it has been stale for six weeks. Feel free to re-open at any time.

github-actions[bot] avatar Jul 04 '24 17:07 github-actions[bot]

I am reopening and adding stale-ignore since this PR seems to have just fallen through the cracks. Ideally it can be rebased and merged. FYI @ebyhr @cwsteinbach @alexjo2144

mosabua avatar Jul 04 '24 17:07 mosabua

https://github.com/trinodb/trino/pull/22781 looks very related and is now merged. @marcinsbd please rebase so that @raunaqmorarka can review here

findepi avatar Jul 26 '24 12:07 findepi

> #22781 looks very related and is now merged. @marcinsbd please rebase so that @raunaqmorarka can review here

done

marcinsbd avatar Aug 06 '24 11:08 marcinsbd

Can we move this forward @ebyhr @cwsteinbach @alexjo2144 @raunaqmorarka @findepi ?

mosabua avatar Aug 26 '24 17:08 mosabua