tzfile reader only reads 32-bit (version 0) zoneinfo files
The output of zic includes a 32-bit legacy section and a 64-bit section that encodes the transition times as 64-bit epoch times. Currently we're only reading the first one. What we should do is read the zoneinfo header to determine how long the first section is, seek to the end of that section and see if there's a second "version 2" section, and if so read that one instead, otherwise go back and finish reading the 32-bit version.
This will have almost no real effect at the moment.
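The approach described above could be sketched roughly as follows. This is a minimal illustration, not dateutil's actual reader: the helper names (`read_header`, `data_block_size`, `seek_to_best_block`) are hypothetical, and the layout assumed is the standard TZif one (a 44-byte header holding six big-endian counts, followed by a data block whose size depends on whether times are 4 or 8 bytes wide).

```python
import io
import struct

# magic, version byte, 15 reserved bytes, six 4-byte big-endian counts (44 bytes)
HEADER = struct.Struct(">4sc15x6L")


def read_header(fobj):
    """Read one TZif header and return (version_byte, counts)."""
    magic, version, *counts = HEADER.unpack(fobj.read(HEADER.size))
    if magic != b"TZif":
        raise ValueError("not a TZif file")
    return version, counts


def data_block_size(counts, time_size):
    """Size in bytes of the data block that follows a header."""
    isutcnt, isstdcnt, leapcnt, timecnt, typecnt, charcnt = counts
    return (timecnt * time_size          # transition times
            + timecnt                    # transition type indices
            + typecnt * 6                # ttinfo records
            + charcnt                    # abbreviation strings
            + leapcnt * (time_size + 4)  # leap-second records
            + isstdcnt + isutcnt)        # std/wall and UT/local indicators


def seek_to_best_block(fobj):
    """Leave fobj positioned at the start of the data block to parse.

    Returns (counts, time_size); time_size is 8 when a second,
    64-bit section exists and 4 otherwise.
    """
    version, counts = read_header(fobj)
    if version == b"\x00":
        # Legacy file: only the 32-bit section exists, and we are
        # already positioned at its data block.
        return counts, 4
    # Skip the 32-bit data block; the 64-bit section starts with
    # its own header immediately afterwards.
    fobj.seek(data_block_size(counts, 4), io.SEEK_CUR)
    _, counts64 = read_header(fobj)
    return counts64, 8
```

Checking the version byte first means the reader never has to seek back: for a legacy file the stream is already sitting at the 32-bit data.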
@pganssle would this issue explain why dates beyond the year 2038 seem to have offset discrepancies for certain timezones? e.g.,
for x in rrulestr('DTSTART;TZID=America/Sao_Paulo:20200101T000001 RRULE:FREQ=YEARLY;INTERVAL=1;COUNT=20').xafter(now):
    print(x)
2020-01-01 00:00:01-02:00
2021-01-01 00:00:01-02:00
2022-01-01 00:00:01-02:00
2023-01-01 00:00:01-02:00
2024-01-01 00:00:01-02:00
2025-01-01 00:00:01-02:00
2026-01-01 00:00:01-02:00
2027-01-01 00:00:01-02:00
2028-01-01 00:00:01-02:00
2029-01-01 00:00:01-02:00
2030-01-01 00:00:01-02:00
2031-01-01 00:00:01-02:00
2032-01-01 00:00:01-02:00
2033-01-01 00:00:01-02:00
2034-01-01 00:00:01-02:00
2035-01-01 00:00:01-02:00
2036-01-01 00:00:01-02:00
2037-01-01 00:00:01-02:00
2038-01-01 00:00:01-03:00 # <-----------
2039-01-01 00:00:01-03:00
@ryanpetrello Yes. dateutil reverts to STD, pytz holds the last value, and the correct thing to do is to create something like a fallback tzstr object from the 64-bit data.
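As a rough illustration of where such a fallback rule could come from: version 2+ TZif files end with a footer consisting of a newline, a POSIX TZ string, and a final newline. The sketch below pulls that string out of the raw file bytes using only the standard library. The function name `read_tz_footer` is hypothetical. Whether `dateutil.tz.tzstr` accepts every footer string is not guaranteed, so that step is deliberately left out.

```python
def read_tz_footer(data):
    """Return the POSIX TZ string from a version 2+ TZif file, or None.

    `data` is the complete file content as bytes.
    """
    if data[:4] != b"TZif":
        raise ValueError("not a TZif file")
    if data[4:5] == b"\x00" or not data.endswith(b"\n"):
        return None  # 32-bit-only files carry no footer
    # The footer is the text between the last two newlines of the file.
    _, sep, footer = data[:-1].rpartition(b"\n")
    if not sep or not footer:
        return None  # missing or empty footer: no rule for later times
    return footer.decode("ascii")
```

The returned string describes how to compute offsets after the last explicit transition, which is the information needed to avoid reverting to STD past 2037.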
I hit this or a related bug today while working on some failing unit tests. The tests use widely spaced years (1900, 2000, 2100 and so on). datetime.datetime(1901,1,1, tzinfo=dateutil.tz.gettz('Europe/Copenhagen')) returns different (and wrong) datetimes depending on the system.
# macOS
md@macos> python3 -c "import datetime, dateutil.tz; print(datetime.datetime(1901,1,1, tzinfo=dateutil.tz.gettz('Europe/Copenhagen')))"
1901-01-01 00:00:00+01:00
> zdump -v "Europe/Copenhagen" | head -n 5
Europe/Copenhagen Fri Dec 13 20:45:52 1901 UTC = Fri Dec 13 21:45:52 1901 CET isdst=0
Europe/Copenhagen Sat Dec 14 20:45:52 1901 UTC = Sat Dec 14 21:45:52 1901 CET isdst=0
Europe/Copenhagen Sun May 14 21:59:59 1916 UTC = Sun May 14 22:59:59 1916 CET isdst=0
Europe/Copenhagen Sun May 14 22:00:00 1916 UTC = Mon May 15 00:00:00 1916 CEST isdst=1
Europe/Copenhagen Sat Sep 30 20:59:59 1916 UTC = Sat Sep 30 22:59:59 1916 CEST isdst=1
# docker image based on Debian
md@docker> docker run -it python:3.7 bash
> pip3 install python-dateutil
#…
> python3 -c "import datetime, dateutil.tz; print(datetime.datetime(1901,1,1, tzinfo=dateutil.tz.gettz('Europe/Copenhagen')))"
1901-01-01 00:00:00+00:50:20
> python3 -c "import datetime
import dateutil.tz
tzinfo = dateutil.tz.gettz('Europe/Copenhagen')
for trans in tzinfo._trans_list_utc[:5]:
    print(datetime.datetime.utcfromtimestamp(trans))"
1901-12-13 20:45:52 # Should have been around 1894-01-01
1916-05-14 22:00:00
1916-09-30 21:00:00
1940-05-14 23:00:00
1942-11-02 01:00:00
> zdump -v "Europe/Copenhagen" | head
Europe/Copenhagen -9223372036854775808 = NULL
Europe/Copenhagen -9223372036854689408 = NULL
Europe/Copenhagen Tue Dec 31 23:09:39 1889 UT = Tue Dec 31 23:59:59 1889 LMT isdst=0 gmtoff=3020
Europe/Copenhagen Tue Dec 31 23:09:40 1889 UT = Wed Jan 1 00:00:00 1890 CMT isdst=0 gmtoff=3020
Europe/Copenhagen Sun Dec 31 23:09:39 1893 UT = Sun Dec 31 23:59:59 1893 CMT isdst=0 gmtoff=3020
Europe/Copenhagen Sun Dec 31 23:09:40 1893 UT = Mon Jan 1 00:09:40 1894 CET isdst=0 gmtoff=3600
Europe/Copenhagen Sun May 14 21:59:59 1916 UT = Sun May 14 22:59:59 1916 CET isdst=0 gmtoff=3600
Europe/Copenhagen Sun May 14 22:00:00 1916 UT = Mon May 15 00:00:00 1916 CEST isdst=1 gmtoff=7200
Europe/Copenhagen Sat Sep 30 20:59:59 1916 UT = Sat Sep 30 22:59:59 1916 CEST isdst=1 gmtoff=7200
Europe/Copenhagen Sat Sep 30 21:00:00 1916 UT = Sat Sep 30 22:00:00 1916 CET isdst=0 gmtoff=3600
I have created PR #1091, which should fix this issue.