hive icon indicating copy to clipboard operation
hive copied to clipboard

HIVE-28337: TestMetaStoreUtils fails for invalid timestamps

Open KiranVelumuri opened this issue 1 year ago • 5 comments

What changes were proposed in this pull request?

Timestamp represented as LocalDateTime is changed to ZonedDateTime to take time-zone into account while creating the timestamp to avoid creating invalid timestamps.

Why are the changes needed?

https://issues.apache.org/jira/browse/HIVE-28337

Does this PR introduce any user-facing change?

No

Is the change a dependency upgrade?

No

How was this patch tested?

mvn test -Dtest=TestMetaStoreUtils

Before the patch: with Random(seed: 23) in line 65(default case) image

with Random(seed: 24) in line 65 image

After the patch: image

KiranVelumuri avatar Jun 18 '24 08:06 KiranVelumuri

Nice catch~ This failed test was also found in https://github.com/apache/hive/pull/5004 and we can revert https://github.com/kgyrtkirk/hive-dev-box/pull/16 once this patch is merged. @zabetak could you also take a look?

wecharyu avatar Jun 21 '24 07:06 wecharyu

The 2 test cases mentioned in https://issues.apache.org/jira/browse/HIVE-28337 are failing for me locally in my machine with Java version 1.8.0_331. I tried running these test cases in https://github.com/apache/hive/pull/5004, but the DST related one was successful. I was investigating as to why this was happening, and now I get it seeing the previous comments that this is occurring due to JDK bug.

KiranVelumuri avatar Jun 24 '24 05:06 KiranVelumuri

The intention of the test is to ensure that conversions from/to string do not alter the value no matter the timezone because the DATE/TIMESTAMP data types should be timezone agnostic.

Changing LocalDateTime to ZonedDateTime defeats the purpose of the tests and actually hides a bug that affects the MetaStoreUtils APIs.

I left a similar comment under the JIRA ticket.

Yes, I agree with you regarding the intention and purpose of the test. I am looking into the issue and would re-work this PR taking the MetaStoreUtils APIs into account.

KiranVelumuri avatar Jun 24 '24 05:06 KiranVelumuri

I have updated the tests and the MetaStoreUtils APIs to always have Timestamp created in UTC timezone, since the timestamps during the daylight savings transition could not be represented in the local time zone.

However, there are 2 known issues for the timestamps mentioned below but it is the expected behaviour from Java's Timestamp:

  1. Timestamps with year 0000 would be converted to year 0001.
  2. Timestamps in the range [1582-10-05 00:00:00, 1582-10-14 23:59:59] (both inclusive) would have 10 days added due to the shift from Julian calendar to Gregorian calendar. (The next day after 1582-10-04 is 1582-10-15)

References: [1] https://www.ibm.com/docs/en/db2/11.5?topic=dttmddtija-date-time-timestamp-values-that-can-cause-problems-in-jdbc-sqlj-applications#imjcc_r0053436__title__5 [2] https://strasbourg.eleusal.com/en/what-happened-between-5-and-14-october-1582/

KiranVelumuri avatar Jun 25 '24 11:06 KiranVelumuri

I understand that due to bugs the tests may fail in certain environments.

If the main motivation of this PR is to avoid the failures when this happens it seems more appropriate to add conditions for detecting skipping the specific test with appropriate bug reference.

If the intention is to fix the problems revealed by the tests then more work is needed since the proposed changes cannot land as they are.

The initial motivation was to fix the test failures, which later changed to solve the problem revealed in MetaStoreUtils. Upon your comments, I have reworked my PR which now solves the problem in MetaStoreUtils and have accordingly modified TestMetaStoreUtils to validate the tests properly.

@zabetak @wecharyu Please review. Thank you.

KiranVelumuri avatar Sep 20 '24 05:09 KiranVelumuri