[BUG] ORC read/write produces wrong `day` values for pre-1582 datetime values

Open ttnghia opened this issue 2 years ago • 4 comments

Similar to https://github.com/rapidsai/cudf/issues/11525, which has just been fixed, I discovered new failures in the ORC reader/writer.

Note that these failures produce wrong days, not wrong seconds as previously reported in https://github.com/rapidsai/cudf/issues/11525.

cpu = datetime.datetime(740, 7, 19, 10, 16, 58, 929621)
gpu = datetime.datetime(740, 7, 23, 10, 16, 58, 929621)

cpu = datetime.datetime(487, 4, 15, 11, 13, 37, 361058)
gpu = datetime.datetime(487, 4, 16, 11, 13, 37, 361058)

cpu = datetime.datetime(1132, 8, 4, 22, 50, 7, 153267)
gpu = datetime.datetime(1132, 8, 11, 22, 50, 7, 153267)

cpu = datetime.datetime(1348, 1, 31, 3, 21, 2, 422651)
gpu = datetime.datetime(1348, 2, 8, 3, 21, 2, 422651)

cpu = datetime.datetime(833, 6, 4, 10, 18, 10, 135672)
gpu = datetime.datetime(833, 6, 8, 10, 18, 10, 135672)

ttnghia, Sep 13 '22 05:09

@vuule @GregoryKimball Can you please test the examples above with pandas? The `cpu = ...` values should be the ground truth.

ttnghia, Sep 13 '22 05:09

Curious, does this only happen for dates before the Julian-Gregorian calendar transition in 1582?

jlowe, Sep 13 '22 13:09

Yeah, the tests failed only for years before that year.
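
A quick check on the numbers reported above: the gap between each `gpu` and `cpu` value grows with the century, which is the pattern the Julian vs. proleptic Gregorian offset follows. The sketch below uses only the Python standard library. The helper names `julian_gregorian_offset` and `observed_vs_expected` are made up for illustration, and the closed-form offset is the usual century rule, not anything taken from the cudf or ORC code. It only shows that the observed gaps are consistent with a calendar mismatch; it does not say which side is wrong.

```python
import datetime

# (cpu, gpu) pairs copied from the report above; the time of day is identical
# within each pair, so the difference is a whole number of days.
PAIRS = [
    (datetime.datetime(487, 4, 15, 11, 13, 37, 361058),
     datetime.datetime(487, 4, 16, 11, 13, 37, 361058)),
    (datetime.datetime(740, 7, 19, 10, 16, 58, 929621),
     datetime.datetime(740, 7, 23, 10, 16, 58, 929621)),
    (datetime.datetime(833, 6, 4, 10, 18, 10, 135672),
     datetime.datetime(833, 6, 8, 10, 18, 10, 135672)),
    (datetime.datetime(1132, 8, 4, 22, 50, 7, 153267),
     datetime.datetime(1132, 8, 11, 22, 50, 7, 153267)),
    (datetime.datetime(1348, 1, 31, 3, 21, 2, 422651),
     datetime.datetime(1348, 2, 8, 3, 21, 2, 422651)),
]


def julian_gregorian_offset(year, month):
    """Days by which a proleptic Gregorian label leads the Julian label.

    Hypothetical helper. January and February are counted with the previous
    year because the offset changes at the end of February in century years.
    The rule is only approximate within a few days of that boundary.
    """
    y = year if month > 2 else year - 1
    return y // 100 - y // 400 - 2


def observed_vs_expected(pairs):
    """Return (year, observed gap in days, expected calendar offset) rows."""
    rows = []
    for cpu, gpu in pairs:
        observed = (gpu - cpu).days
        expected = julian_gregorian_offset(cpu.year, cpu.month)
        rows.append((cpu.year, observed, expected))
    return rows


for year, observed, expected in observed_vs_expected(PAIRS):
    print(year, observed, expected)
```

For all five distinct pairs the observed gap and the expected offset agree, which suggests one side is labeling the stored value in a Julian (or hybrid) calendar and the other in the proleptic Gregorian calendar.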

ttnghia, Sep 13 '22 15:09

> Yeah, the tests failed only for years before that year.

This is probably more complicated, because the transition took roughly 300 years. To be precise, one really needs to interpret the time relative to a location and government. A timezone is not enough if it is only a numeric offset: for example, Finland has the same timezone as Moscow but switched calendars about 200 years earlier.
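
To make the location dependence concrete, here is a minimal sketch that labels the same physical day under different cutover dates, going through the Julian Day Number (JDN). Everything here is an assumption for illustration: the function names are hypothetical, the `CUTOVERS` table lists three commonly cited first-Gregorian days rather than anything cudf or ORC knows about, and the Julian conversions are the standard integer formulas. The Gregorian side relies on Python's `datetime.date`, which is proleptic Gregorian.

```python
import datetime

# JDN of proleptic Gregorian 0001-01-01 is 1721426 and its ordinal is 1.
_ORDINAL_TO_JDN = 1721425


def jdn_from_gregorian(year, month, day):
    """Julian Day Number of a proleptic Gregorian date."""
    return datetime.date(year, month, day).toordinal() + _ORDINAL_TO_JDN


def gregorian_from_jdn(jdn):
    """Proleptic Gregorian (year, month, day) for a Julian Day Number."""
    d = datetime.date.fromordinal(jdn - _ORDINAL_TO_JDN)
    return (d.year, d.month, d.day)


def jdn_from_julian(year, month, day):
    """Julian Day Number of a Julian calendar date."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def julian_from_jdn(jdn):
    """Julian calendar (year, month, day) for a Julian Day Number."""
    e = 4 * (jdn + 1401) + 3
    h = 5 * ((e % 1461) // 4) + 2
    day = (h % 153) // 5 + 1
    month = (h // 153 + 2) % 12 + 1
    year = e // 1461 - 4716 + (14 - month) // 12
    return (year, month, day)


def civil_label(jdn, first_gregorian_day):
    """Label a day as a region would have: Julian before its cutover,
    Gregorian from the cutover onward."""
    cutover = jdn_from_gregorian(*first_gregorian_day)
    return gregorian_from_jdn(jdn) if jdn >= cutover else julian_from_jdn(jdn)


# Illustrative first days on the Gregorian calendar (assumed, not exhaustive).
CUTOVERS = {
    "papal_1582": (1582, 10, 15),
    "great_britain_1752": (1752, 9, 14),
    "russia_1918": (1918, 2, 14),
}

day = jdn_from_gregorian(1700, 6, 15)
for region, first_day in CUTOVERS.items():
    print(region, civil_label(day, first_day))
```

The same day number gets a different calendar label depending on which cutover applies, so a single hard-coded 1582 switch is itself a convention rather than a universal truth. Reading the `cpu` values from this issue as Julian labels and converting them with these helpers gives the `gpu` values as proleptic Gregorian labels.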

wence-, Sep 14 '22 13:09