presto icon indicating copy to clipboard operation
presto copied to clipboard

Timestamp time zone handling appears broken in ORC format for 0.167-t.0.2

Open Downchuck opened this issue 8 years ago • 6 comments

Teradata Presto release appears to be deserializing the timestamp column incorrectly in ORC files with time zone information. (ORC written from Hive 1.2.1)

Note the query returns correctly in standard Presto and in Hive. The table was built via Hive, transforming the first column, a string, into a timestamp column; the local time zone of Hive/Presto is America/Los_Angeles.

Teradata Presto 0.167-t.0.2:
time         |           tm            |                    _col2
---------------------+-------------------------+---------------------------------------------
 01-30-2016-00:11:02 | 2016-01-30 08:11:02.000 | 2016-01-30 07:11:02.000 America/Los_Angeles
 01-30-2016-00:39:28 | 2016-01-30 08:39:28.000 | 2016-01-30 07:39:28.000 America/Los_Angeles


Mini cluster (0.17x)
time         |           tm            |                    _col2
---------------------+-------------------------+---------------------------------------------
 01-30-2016-00:11:02 | 2016-01-30 00:11:02.000 | 2016-01-30 00:11:02.000 America/Los_Angeles
 01-30-2016-00:39:28 | 2016-01-30 00:39:28.000 | 2016-01-30 00:39:28.000 America/Los_Angeles

presto:mesoads> select time,tm, tm at time zone 'America/Los_Angeles' from orc_table 

Downchuck avatar Jun 19 '17 20:06 Downchuck

Thanks for the bug report. We will look into that. In a mean while, the workaround may be setting session property as follows: set session legacy_timestamp = true; or setting server property: deprecated.legacy-timestamp = true This will most likely restore original prestodb/presto behaviour for date time types, which has problems of their own though.

fiedukow avatar Jun 22 '17 13:06 fiedukow

@fiedukow Confirmed, legacy_timestamp works as a workaround.

Downchuck avatar Jun 28 '17 15:06 Downchuck

@fiedukow Is this issue related with TIMESTAMP behaviour does not match sql standard #7122 ?

zz22394 avatar Sep 01 '17 05:09 zz22394

As far as I know, it's only tangentially related: this ORC bug exists both for the existing timestamp implementation and the implementation in #7122.

cawallin avatar Sep 01 '17 14:09 cawallin

Seems like we can't use any timezone but UTC -- as the hive connector needs to be set to UTC and Presto, to avoid ORC read issues for data written in UTC.

Downchuck avatar Sep 01 '17 19:09 Downchuck

@cawallin Thanks. After reading #7122 and #7480 and DATE TIME SUPPORT IN PRESTO, I think I had got your point.

zz22394 avatar Sep 07 '17 15:09 zz22394