[DISCUSSION] Use INT96 as default timestamp format in Parquet tables
Hi everyone,
I want to raise a discussion about the current behavior in Drill regarding Parquet timestamps.
Drill uses INT64 for timestamps, and you can switch to INT96 by setting `store.parquet.reader.int96_as_timestamp` to `true`. With that it is not a big problem to work with both types of Parquet timestamps, but since Spark uses INT96 by default, you have to change this option in almost every situation, especially when working with newer lakehouse architectures like Delta Lake and Iceberg.
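For reference, this is how the switch looks in Drill SQL today (a minimal sketch; using the session scope here is just one choice, `ALTER SYSTEM` would change the server-wide default instead):

```sql
-- Interpret INT96 Parquet values as timestamps for the current session only.
-- ALTER SYSTEM SET would change the server-wide default instead.
ALTER SESSION SET `store.parquet.reader.int96_as_timestamp` = true;
```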
For Spark it is clearly documented that they use INT96 in all scenarios:
Here for reading -> https://spark.apache.org/docs/latest/sql-data-sources-parquet.html

> Some Parquet-producing systems, in particular Impala and Hive, store Timestamp into INT96. This flag tells Spark SQL to interpret INT96 data as a timestamp to provide compatibility with these systems.
Here for writing -> https://spark.apache.org/docs/latest/configuration.html

> Sets which Parquet timestamp type to use when Spark writes data to Parquet files. INT96 is a non-standard but commonly used timestamp type in Parquet. TIMESTAMP_MICROS is a standard timestamp type in Parquet, which stores number of microseconds from the Unix epoch. TIMESTAMP_MILLIS is also standard, but with millisecond precision, which means Spark has to truncate the microsecond portion of its timestamp value.
Of course we could advise every Drill user to write their Spark jobs with the configuration `spark.sql.parquet.outputTimestampType` set to `TIMESTAMP_MICROS` or `TIMESTAMP_MILLIS`, or to always toggle this Drill option after startup, but it is still an additional step.
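To make the Spark-side workaround concrete, a hedged sketch of setting that property from Spark SQL (it can just as well be set via `spark.conf.set` in a job or with `--conf` at submit time):

```sql
-- Ask Spark to write standard INT64 Parquet timestamps instead of INT96.
SET spark.sql.parquet.outputTimestampType = TIMESTAMP_MICROS;
```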
@vvysotskyi mentioned that if we switched this default now, we would have issues with some UDFs, so I think it could be a topic for the upcoming Drill 2.0.0 as a breaking change.
What do you think?
I'll weigh in here. Since this is user configurable, it would make sense to make INT96 interpretation the default and fix the UDFs. We're about to release 1.21, which has a lot of major improvements, so IMHO it would be a good time to do so.
Vova, would you mind explaining how this will break UDFs? Best, -- C
I'm a -1 at this point. Even though it has historically been a popular thing to do, the use of INT96 for timestamps is nonstandard and deprecated [1]. The Parquet project wants to discourage its ongoing use [2].
[1] https://github.com/apache/parquet-format/pull/86/files
[2] https://issues.apache.org/jira/browse/PARQUET-323
So I think we should leave this switched off by default and change the description of our option to say that INT96 timestamps are deprecated and we only recommend enabling them for the sake of reading legacy data.
@jnturton I am of your opinion; I wasn't aware of this situation.
I wonder why Spark still has this as its default.
@Z0ltrix it looks like they were worried about compatibility with Hive and Presto when they last attempted to change it.
https://github.com/apache/spark/pull/28450
I guess that because Spark is used much more for ETL and Parquet writing, they have to be more conservative about the compatibility of their outputs.
The UDF that doesn't work when this option is enabled is `CONVERT_FROM(timestamp_field, 'TIMESTAMP_IMPALA')`, and it doesn't work because the `timestamp_field` type becomes timestamp instead of binary. That is expected behavior, so there is no need to fix it. Details regarding this option can be found here: https://drill.apache.org/docs/parquet-format/#about-int96-support
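To illustrate the two reading modes that page describes, a small sketch (the column and file names are made up for illustration):

```sql
-- With store.parquet.reader.int96_as_timestamp = false (the default),
-- INT96 values arrive as binary and must be decoded explicitly:
SELECT CONVERT_FROM(event_time, 'TIMESTAMP_IMPALA') AS event_time
FROM dfs.`/data/spark_output/part-00000.parquet`;

-- With the option enabled, the same column already arrives as TIMESTAMP,
-- so no conversion is needed (and CONVERT_FROM on it no longer applies):
SELECT event_time
FROM dfs.`/data/spark_output/part-00000.parquet`;
```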
Side note: the `INT96` Parquet physical type is deprecated as a whole, not only its usage for timestamps.
There are a lot of cases where such data could be passed to Drill. The good news is that regardless of the option value, Drill doesn't write this type; it only reads it, and interprets the data as timestamp values instead of binary when the option is enabled. So I think that enabling the option (and possibly keeping the behavior in sync with Spark and other tools) will only improve the user experience.
I guess the only loss of enabling it by default, apart from not helping Parquet to push people away from it, would be worse UX for people who were using INT96 as a byte array or a very wide integer. And that's probably no one at all.
I'll update my vote to a +0.