perl5
perl5 copied to clipboard
Fix 5.42 regression with strftime() and daylight savings
tl;dr:
Fixes #23878
I botched this in Perl 5.42. These conditional compilation statements in locale.c were just plain wrong, causing code to be skipped that should have been compiled. It only affected the few hours of the year when daylight savings time is removed, so that the hour value is repeated. We didn't have a good test for that.
gory details:
libc uses 'struct tm' to hold information about a given instant in time, containing fields for things like the year, month, hour, etc. The libc function mktime() is used to normalize the structure, adjusting, say, an input Nov 31 to be Dec 01.
One of the fields in the structure, 'is_dst', indicates if daylight savings is in effect, or whether that fact is unknown. If unknown, mktime() is supposed to calculate the answer and to change 'is_dst' accordingly. Some implementations appear to always do this calculation even when the input value says the result is known. Others appear to honor it.
Some libc implementations have extra fields in 'struct tm'.
Perl has a stripped down version of mktime(), called mini_mktime(), written by Larry Wall a long time ago. I don't know why. This crippled version ignores locale and daylight time. It also doesn't know about the extra fields in 'struct tm' that some implementations have. Nor can it be extended to know about those fields, as they are dependent on timezone and daylight time, which it deliberately doesn't consider.
The botched #ifdef's were supposed to compensate for both the extra fields in the struct and that some libc implementations always recalculate 'is_dst'.
On systems with these fields, the botched #if's caused only mini_mktime() to be called. This meant that these extra fields didn't get populated, and daylight time is never considered to be in effect. And 'is_dst' does not get changed from the input.
On systems without these fields, the regular libc mktime() would be called appropriately.
The bottom line is that for the portion of the year when daylight savings is not in effect, things mostly worked properly. The two extra fields would not be populated, so if some code were to read them, it would only get the proper values by chance. We got no failure reports of this. I attribute that to the fact that the use of these is not portable, so code wouldn't tend to use them. There are portable ways to access the information they contain.
Tests were failing for the portions of the year when daylight savings is in effect; see GH #22351. The code looked correct just reading it (not seeing the flaw in the #ifdef's), so I assumed that it was an issue in the libc implementations and instituted a workaround. (I can't now think of a platform where there hasn't been a problem with a libc with something regarding locales, so that was a reasonable assumption.)
Among other things that workaround overrode the 'is_dst' field after the call to mini_mktime(), so that the value actually passed to libc strftime() indicated that daylight is in effect.
What happens next depends on the libc strftime() implementation. It could conceivably itself call mktime() which might choose to override is_dst to be the correct value, and everything would always work. The more likely possibility is that it just takes the values in the struct as-is. Remember that those values on systems with the extra fields were calculated as if daylight savings wasn't in effect, but now we're telling strftime() to use those values as if it were in effect. This is a discrepancy. I'd have to trace through some libc implementations to understand why this discrepancy seems to not matter except at the transition time.
But the bottom line is this p.r removes that discrepancy, and causes mktime() to be called appropriately on systems where it wasn't, so strftime() should now function properly. And the workarounds are also removed.
This regression fix should go into a maintenance release.
- This set of changes requires a perldelta entry, and it is included.
@khwilliamson, this p.r. had 2 failures in CI, one on Windows, one on Linux. Could you make a smoke-me branch out of this so that we can test it on some other platforms? Thanks.
After a lot of effort (facilitated by @bram-perl, I have concluded that the i386 failure is due to a bug in its glibc.
< Return from localtime for 1741510799= tm_sec=59, tm_min=59, tm_hour=0, tm_mday=9, tm_mon=2, tm_year=125, isdst=0 gmtoff=-28800, tm_zone=PST
> Return from localtime for 1741510799= tm_sec=59, tm_min=59, tm_hour=1, tm_mday=9, tm_mon=2, tm_year=125, isdst=1 gmtoff=-25200, tm_zone=PDT
The call marked < is my 64-bit Linux box compiled with g++ The one marked > is i386 32-bit compiled with gcc. (The same results are obtained if I use gcc on my box.) Notice that the calls are passed identical inputs, but have different outputs. The i386 one improperly thinks daylight savings time is in effect, and adjusts the hour and offset from gmt accordingly.
Further, I note the man page for localtime on my box says something belied by the result, and not supported by either the C or POSIX standards. It says that daylight is set to 1 if the locale has dst in effect for any time of the year.
I'll investigate using gmtime instead to work around the libc issue here (and maybe on other platforms).
After a lot of effort (facilitated by @bram-perl, I have concluded that the i386 failure is due to a bug in its glibc.
< Return from localtime for 1741510799= tm_sec=59, tm_min=59, tm_hour=0, tm_mday=9, tm_mon=2, tm_year=125, isdst=0 gmtoff=-28800, tm_zone=PST > Return from localtime for 1741510799= tm_sec=59, tm_min=59, tm_hour=1, tm_mday=9, tm_mon=2, tm_year=125, isdst=1 gmtoff=-25200, tm_zone=PDTThe call marked
<is my 64-bit Linux box compiled with g++ The one marked>is i386 32-bit compiled with gcc. (The same results are obtained if I use gcc on my box.) Notice that the calls are passed identical inputs, but have different outputs. The i386 one improperly thinks daylight savings time is in effect, and adjusts the hour and offset from gmt accordingly.Further, I note the man page for localtime on my box says something belied by the result, and not supported by either the C or POSIX standards. It says that daylight is set to 1 if the locale has dst in effect for any time of the year.
I'll investigate using gmtime instead to work around the libc issue here (and maybe on other platforms).
Would we need to investigate this on other platforms (e.g., FreeBSD)? Or with other C-compilers (e.g., clang)? If so, can you provide specfic instructions?
What version of glibc is this? Can you get a C testcase please?
perl -V gives gnulibc_version='2.41' and libc=/lib/i386-linux-gnu/libc.so.6
@jkeenan All major platforms have bugs with their locale handling; Darwin being the worst. I have written lots of workarounds for our code. And lots of skips for our tests for particular platforms.
This one just happened to show up. There's no telling what others are out there. This bug is just for a moment in time, and could easily be an entry in their database that is wrong, and we wouldn't know what other moments in time are also problematic. I'll see if using gmtime instead of localtime fixes this; I have my doubts. In any case, this came up due to a new test case I just added. I'm doing a smoke-me now.
I earlier uploaded a .c file to reproduce the problem. I later found a bug. Here is a corrected version localtime_bug.c
Using it, I found that this libc thinks daylight time starts precisely 5 hours earlier than it actually does.
Thanks. I can't reproduce w/ glibc-2.42 (tip of release branch). I also can't spot any commits that would definitely be related.
The container failure at https://github.com/Perl/perl5/actions/runs/19575295110/job/56059927984?pr=23921 is using Debian trixie (i386) and has libc6-dev i386 2.41-12.
@eggert By chance, do you remember anything that might've fixed the testcase from https://github.com/Perl/perl5/pull/23921#issuecomment-3561166749, giving wrong isdst output like in https://github.com/Perl/perl5/pull/23921#issuecomment-3554531929?
Could this be an error with the container configuration somehow?
If so, is it feasible for someone to try this by not using the container? If it still fails, could glibc be bisected?