Example files are using legacy timezone names (US/Pacific)
The example ORC files use a timezone of US/Pacific which is no longer included in all Linux distributions. Ubuntu 24.04, for example, has moved this to a separate tzdata-legacy package. This can cause issues for ORC file readers on systems missing that legacy time zone data.
Should the example ORC files be updated to use a more current time zone name, like America/Los_Angeles?
Verifying the time zone in the stripe footers:
wget https://github.com/apache/orc/raw/refs/heads/main/examples/TestOrcFile.testDate1900.orc
orc-metadata -v TestOrcFile.testDate1900.orc
# Shows stripe footers with "timezone": "US/Pacific"
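To see just the distinct time zone names instead of reading every stripe footer, the verbose output can be filtered. This is a minimal sketch that assumes the output keeps the `"timezone": "..."` formatting shown in this thread; `distinct_timezones` is a hypothetical helper name, not part of the ORC tools.

```shell
# Print each distinct writer time zone found in `orc-metadata -v` output.
# Reads the tool's output from stdin; relies only on grep, sed, and sort.
distinct_timezones() {
  grep -o '"timezone": "[^"]*"' | sed 's/.*: "\(.*\)"/\1/' | sort -u
}

# Usage (assumes orc-metadata is on PATH and the file was downloaded):
#   orc-metadata -v TestOrcFile.testDate1900.orc | distinct_timezones
```

A file whose stripes were all written in one zone yields a single line, which makes it easy to scan a directory of example files for legacy names.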
Additional context
- https://bugs.launchpad.net/ubuntu/+source/tzdata/+bug/2058249
- https://github.com/apache/arrow/issues/40633
- https://github.com/pandas-dev/pandas/issues/56292
- https://github.com/rapidsai/cudf/pull/16998#issuecomment-2400980607
Thank you for reporting, @bdice .
cc @williamhyun , @wgtmac , too.
@bdice, according to our official Java tool, the type of the time column is timestamp without time zone, isn't it?
$ orc-tools version
ORC 2.0.2
$ orc-tools meta ./examples/TestOrcFile.testDate1900.orc | grep Type
Processing data file examples/TestOrcFile.testDate1900.orc [length: 30941]
Type: struct<time:timestamp,date:date>
Please see here. Given that there is no timezone, I'm not sure if the root cause is the file.
- https://orc.apache.org/docs/types.html#timestamps
ORC includes two different forms of timestamps from the SQL world:
- Timestamp is a date and time without a time zone, which does not change based on the time zone of the reader.
- Timestamp with local time zone is a fixed instant in time, which does change based on the time zone of the reader.
Instead, this looks like an issue on the C++ library side, because orc-metadata is based on the C++ library. BTW, ORC-1481 was already fixed in Apache ORC 2.0.0. Do you mean that you hit this issue with Apache ORC 2.0+?
- https://github.com/apache/orc/pull/1587
It looks like a breaking change to time zone names in TZDB. I will take a look. cc @ffacs
Thank you so much, @wgtmac .
https://bugs.launchpad.net/ubuntu/+source/tzdata/+bug/2058249 explains the root cause: tzdata has intentionally moved time zone files like US/Pacific to a separate tzdata-legacy package without providing symlinks, so it is a breaking change for legacy ORC files. At the same time, some downstream projects that depend on the Apache ORC C++ library use ORC files from https://github.com/apache/orc/tree/main/examples for CI validation. These CI jobs started to fail once they upgraded to Ubuntu 24.04, which ships the new version of tzdata without tzdata-legacy installed.
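Whether a given system is affected can be checked by looking for the zone's file directly. The sketch below assumes the time zone database lives under /usr/share/zoneinfo, which is the usual location on Linux but is not guaranteed; `tz_available` is a hypothetical helper name.

```shell
# Succeed if the named zone has a file in the zoneinfo directory.
# $1: zone name, e.g. US/Pacific
# $2: zoneinfo directory (optional; defaults to /usr/share/zoneinfo, an assumption)
tz_available() {
  tz_name=$1
  tz_dir=${2:-/usr/share/zoneinfo}
  [ -f "$tz_dir/$tz_name" ]
}

# Report the legacy name and its current equivalent on this machine.
for zone in US/Pacific America/Los_Angeles; do
  if tz_available "$zone"; then
    echo "$zone: available"
  else
    echo "$zone: missing"
  fi
done
```

If US/Pacific is reported missing while America/Los_Angeles is available, the system most likely lacks the legacy zone data described above.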
IMO, we should not change TestOrcFile.testDate1900.orc, as it is a good example for checking whether tzdata-legacy is required. One thing that I don't understand is that we have CI jobs running on Ubuntu 24.04, but they do not fail.
IMO, we should not change TestOrcFile.testDate1900.orc as it is a good example to check if tzdata-legacy is required.
That is fine with me! I have worked around this by installing tzdata-legacy on Ubuntu 24.04. I can see the potential value here. I am okay with closing this issue with no action, if that is acceptable to others.
Another possible course of action would be to leave TestOrcFile.testDate1900.orc as-is, and update the timezone names in TestOrcFile.testDate2038.orc (currently also using US/Pacific).
2038 test file output
Using orc 2.0.2:
$ orc-metadata -v TestOrcFile.testDate2038.orc
{ "name": "TestOrcFile.testDate2038.orc",
"type": "struct<time:timestamp,date:date>",
"attributes": {},
"rows": 212000,
"stripe count": 28,
"format": "0.12", "writer version": "HIVE-8732", "software version": "ORC Java",
"compression": "zlib", "compression block": 10000,
"file length": 95787,
"content": 94762, "stripe stats": 686, "footer": 314, "postscript": 24,
"row index stride": 10000,
"user metadata": {
},
"stripes": [
{ "stripe": 0, "rows": 15000,
"offset": 3, "length": 6410,
"index": 153, "data": 6194, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 3, "length": 21 },
{ "id": 1, "column": 1, "kind": "index", "offset": 24, "length": 78 },
{ "id": 2, "column": 2, "kind": "index", "offset": 102, "length": 54 },
{ "id": 3, "column": 1, "kind": "data", "offset": 156, "length": 507 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 663, "length": 5416 },
{ "id": 5, "column": 2, "kind": "data", "offset": 6079, "length": 271 }
],
"timezone": "US/Pacific"
},
{ "stripe": 1, "rows": 5000,
"offset": 6413, "length": 2214,
"index": 76, "data": 2075, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 6413, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 6425, "length": 37 },
{ "id": 2, "column": 2, "kind": "index", "offset": 6462, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 6489, "length": 171 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 6660, "length": 1803 },
{ "id": 5, "column": 2, "kind": "data", "offset": 8463, "length": 101 }
],
"timezone": "US/Pacific"
},
{ "stripe": 2, "rows": 10000,
"offset": 8627, "length": 4321,
"index": 76, "data": 4182, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 8627, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 8639, "length": 37 },
{ "id": 2, "column": 2, "kind": "index", "offset": 8676, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 8703, "length": 340 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 9043, "length": 3608 },
{ "id": 5, "column": 2, "kind": "data", "offset": 12651, "length": 234 }
],
"timezone": "US/Pacific"
},
{ "stripe": 3, "rows": 10000,
"offset": 12948, "length": 4326,
"index": 77, "data": 4186, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 12948, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 12960, "length": 38 },
{ "id": 2, "column": 2, "kind": "index", "offset": 12998, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 13025, "length": 341 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 13366, "length": 3608 },
{ "id": 5, "column": 2, "kind": "data", "offset": 16974, "length": 237 }
],
"timezone": "US/Pacific"
},
{ "stripe": 4, "rows": 5000,
"offset": 17274, "length": 2229,
"index": 76, "data": 2090, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 17274, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 17286, "length": 37 },
{ "id": 2, "column": 2, "kind": "index", "offset": 17323, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 17350, "length": 174 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 17524, "length": 1803 },
{ "id": 5, "column": 2, "kind": "data", "offset": 19327, "length": 113 }
],
"timezone": "US/Pacific"
},
{ "stripe": 5, "rows": 10000,
"offset": 19503, "length": 4401,
"index": 77, "data": 4261, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 19503, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 19515, "length": 38 },
{ "id": 2, "column": 2, "kind": "index", "offset": 19553, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 19580, "length": 416 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 19996, "length": 3608 },
{ "id": 5, "column": 2, "kind": "data", "offset": 23604, "length": 237 }
],
"timezone": "US/Pacific"
},
{ "stripe": 6, "rows": 5000,
"offset": 23904, "length": 2268,
"index": 76, "data": 2129, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 23904, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 23916, "length": 37 },
{ "id": 2, "column": 2, "kind": "index", "offset": 23953, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 23980, "length": 210 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 24190, "length": 1803 },
{ "id": 5, "column": 2, "kind": "data", "offset": 25993, "length": 116 }
],
"timezone": "US/Pacific"
},
{ "stripe": 7, "rows": 10000,
"offset": 26172, "length": 4397,
"index": 77, "data": 4257, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 26172, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 26184, "length": 38 },
{ "id": 2, "column": 2, "kind": "index", "offset": 26222, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 26249, "length": 419 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 26668, "length": 3608 },
{ "id": 5, "column": 2, "kind": "data", "offset": 30276, "length": 230 }
],
"timezone": "US/Pacific"
},
{ "stripe": 8, "rows": 5000,
"offset": 30569, "length": 2269,
"index": 76, "data": 2130, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 30569, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 30581, "length": 37 },
{ "id": 2, "column": 2, "kind": "index", "offset": 30618, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 30645, "length": 213 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 30858, "length": 1803 },
{ "id": 5, "column": 2, "kind": "data", "offset": 32661, "length": 114 }
],
"timezone": "US/Pacific"
},
{ "stripe": 9, "rows": 10000,
"offset": 32838, "length": 4390,
"index": 77, "data": 4250, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 32838, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 32850, "length": 38 },
{ "id": 2, "column": 2, "kind": "index", "offset": 32888, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 32915, "length": 411 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 33326, "length": 3608 },
{ "id": 5, "column": 2, "kind": "data", "offset": 36934, "length": 231 }
],
"timezone": "US/Pacific"
},
{ "stripe": 10, "rows": 5000,
"offset": 37228, "length": 2268,
"index": 76, "data": 2129, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 37228, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 37240, "length": 37 },
{ "id": 2, "column": 2, "kind": "index", "offset": 37277, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 37304, "length": 211 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 37515, "length": 1803 },
{ "id": 5, "column": 2, "kind": "data", "offset": 39318, "length": 115 }
],
"timezone": "US/Pacific"
},
{ "stripe": 11, "rows": 10000,
"offset": 39496, "length": 4399,
"index": 77, "data": 4259, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 39496, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 39508, "length": 38 },
{ "id": 2, "column": 2, "kind": "index", "offset": 39546, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 39573, "length": 414 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 39987, "length": 3608 },
{ "id": 5, "column": 2, "kind": "data", "offset": 43595, "length": 237 }
],
"timezone": "US/Pacific"
},
{ "stripe": 12, "rows": 5000,
"offset": 43895, "length": 2266,
"index": 76, "data": 2127, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 43895, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 43907, "length": 37 },
{ "id": 2, "column": 2, "kind": "index", "offset": 43944, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 43971, "length": 211 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 44182, "length": 1803 },
{ "id": 5, "column": 2, "kind": "data", "offset": 45985, "length": 113 }
],
"timezone": "US/Pacific"
},
{ "stripe": 13, "rows": 10000,
"offset": 46161, "length": 4395,
"index": 77, "data": 4255, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 46161, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 46173, "length": 38 },
{ "id": 2, "column": 2, "kind": "index", "offset": 46211, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 46238, "length": 412 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 46650, "length": 3608 },
{ "id": 5, "column": 2, "kind": "data", "offset": 50258, "length": 235 }
],
"timezone": "US/Pacific"
},
{ "stripe": 14, "rows": 5000,
"offset": 50556, "length": 2267,
"index": 76, "data": 2128, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 50556, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 50568, "length": 37 },
{ "id": 2, "column": 2, "kind": "index", "offset": 50605, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 50632, "length": 211 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 50843, "length": 1803 },
{ "id": 5, "column": 2, "kind": "data", "offset": 52646, "length": 114 }
],
"timezone": "US/Pacific"
},
{ "stripe": 15, "rows": 10000,
"offset": 52823, "length": 4401,
"index": 77, "data": 4261, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 52823, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 52835, "length": 38 },
{ "id": 2, "column": 2, "kind": "index", "offset": 52873, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 52900, "length": 414 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 53314, "length": 3608 },
{ "id": 5, "column": 2, "kind": "data", "offset": 56922, "length": 239 }
],
"timezone": "US/Pacific"
},
{ "stripe": 16, "rows": 5000,
"offset": 57224, "length": 2272,
"index": 76, "data": 2133, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 57224, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 57236, "length": 37 },
{ "id": 2, "column": 2, "kind": "index", "offset": 57273, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 57300, "length": 211 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 57511, "length": 1803 },
{ "id": 5, "column": 2, "kind": "data", "offset": 59314, "length": 119 }
],
"timezone": "US/Pacific"
},
{ "stripe": 17, "rows": 10000,
"offset": 59496, "length": 4396,
"index": 76, "data": 4257, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 59496, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 59508, "length": 37 },
{ "id": 2, "column": 2, "kind": "index", "offset": 59545, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 59572, "length": 414 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 59986, "length": 3608 },
{ "id": 5, "column": 2, "kind": "data", "offset": 63594, "length": 235 }
],
"timezone": "US/Pacific"
},
{ "stripe": 18, "rows": 10000,
"offset": 63892, "length": 4399,
"index": 77, "data": 4259, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 63892, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 63904, "length": 38 },
{ "id": 2, "column": 2, "kind": "index", "offset": 63942, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 63969, "length": 416 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 64385, "length": 3608 },
{ "id": 5, "column": 2, "kind": "data", "offset": 67993, "length": 235 }
],
"timezone": "US/Pacific"
},
{ "stripe": 19, "rows": 5000,
"offset": 68291, "length": 2265,
"index": 76, "data": 2126, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 68291, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 68303, "length": 37 },
{ "id": 2, "column": 2, "kind": "index", "offset": 68340, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 68367, "length": 210 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 68577, "length": 1803 },
{ "id": 5, "column": 2, "kind": "data", "offset": 70380, "length": 113 }
],
"timezone": "US/Pacific"
},
{ "stripe": 20, "rows": 10000,
"offset": 70556, "length": 4398,
"index": 77, "data": 4258, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 70556, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 70568, "length": 38 },
{ "id": 2, "column": 2, "kind": "index", "offset": 70606, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 70633, "length": 413 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 71046, "length": 3608 },
{ "id": 5, "column": 2, "kind": "data", "offset": 74654, "length": 237 }
],
"timezone": "US/Pacific"
},
{ "stripe": 21, "rows": 5000,
"offset": 74954, "length": 2263,
"index": 76, "data": 2124, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 74954, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 74966, "length": 37 },
{ "id": 2, "column": 2, "kind": "index", "offset": 75003, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 75030, "length": 206 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 75236, "length": 1803 },
{ "id": 5, "column": 2, "kind": "data", "offset": 77039, "length": 115 }
],
"timezone": "US/Pacific"
},
{ "stripe": 22, "rows": 10000,
"offset": 77217, "length": 4403,
"index": 77, "data": 4263, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 77217, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 77229, "length": 38 },
{ "id": 2, "column": 2, "kind": "index", "offset": 77267, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 77294, "length": 417 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 77711, "length": 3608 },
{ "id": 5, "column": 2, "kind": "data", "offset": 81319, "length": 238 }
],
"timezone": "US/Pacific"
},
{ "stripe": 23, "rows": 5000,
"offset": 81620, "length": 2266,
"index": 77, "data": 2126, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 81620, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 81632, "length": 38 },
{ "id": 2, "column": 2, "kind": "index", "offset": 81670, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 81697, "length": 207 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 81904, "length": 1803 },
{ "id": 5, "column": 2, "kind": "data", "offset": 83707, "length": 116 }
],
"timezone": "US/Pacific"
},
{ "stripe": 24, "rows": 5000,
"offset": 83886, "length": 2267,
"index": 77, "data": 2127, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 83886, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 83898, "length": 38 },
{ "id": 2, "column": 2, "kind": "index", "offset": 83936, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 83963, "length": 213 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 84176, "length": 1803 },
{ "id": 5, "column": 2, "kind": "data", "offset": 85979, "length": 111 }
],
"timezone": "US/Pacific"
},
{ "stripe": 25, "rows": 5000,
"offset": 86153, "length": 2265,
"index": 76, "data": 2126, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 86153, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 86165, "length": 37 },
{ "id": 2, "column": 2, "kind": "index", "offset": 86202, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 86229, "length": 211 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 86440, "length": 1803 },
{ "id": 5, "column": 2, "kind": "data", "offset": 88243, "length": 112 }
],
"timezone": "US/Pacific"
},
{ "stripe": 26, "rows": 10000,
"offset": 88418, "length": 4399,
"index": 77, "data": 4259, "footer": 63,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 88418, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 88430, "length": 38 },
{ "id": 2, "column": 2, "kind": "index", "offset": 88468, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 88495, "length": 414 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 88909, "length": 3608 },
{ "id": 5, "column": 2, "kind": "data", "offset": 92517, "length": 237 }
],
"timezone": "US/Pacific"
},
{ "stripe": 27, "rows": 2000,
"offset": 92817, "length": 1945,
"index": 76, "data": 1808, "footer": 61,
"encodings": [
{ "column": 0, "encoding": "direct" },
{ "column": 1, "encoding": "direct rle2" },
{ "column": 2, "encoding": "direct rle2" }
],
"streams": [
{ "id": 0, "column": 0, "kind": "index", "offset": 92817, "length": 12 },
{ "id": 1, "column": 1, "kind": "index", "offset": 92829, "length": 37 },
{ "id": 2, "column": 2, "kind": "index", "offset": 92866, "length": 27 },
{ "id": 3, "column": 1, "kind": "data", "offset": 92893, "length": 89 },
{ "id": 4, "column": 1, "kind": "secondary", "offset": 92982, "length": 1661 },
{ "id": 5, "column": 2, "kind": "data", "offset": 94643, "length": 58 }
],
"timezone": "US/Pacific"
}
]
}
@bdice I think we can keep those files, as they were created by legacy writers: "format": "0.12", "writer version": "HIVE-8732", "software version": "ORC Java". We can use the latest writer to generate a new file with equivalent data but with the new time zone names.
Thank you all. Let me close this issue, because it seems we agree that the old files should be kept as-is. Feel free to make a PR for the newly proposed file.