Surprising `%c` behavior

Open rok opened this issue 4 years ago • 6 comments

Thanks for the great library @HowardHinnant !

I'm adding some functionality to Apache Arrow's strftime and I came across this behavior:

#include "date/tz.h"
#include <iostream>
#include <locale>
#include <sstream>
#include <string>

int
main()
{
  date::zoned_time<date::days> t{"EST", date::local_days{date::literals::jan/1/2022}};
  std::string fmt = "%c %Z";
  std::ostringstream bufstream;
  bufstream.imbue(std::locale("en_GB.UTF-8"));
  date::to_stream(bufstream, fmt.c_str(), t);
  std::cout << std::move(bufstream).str() << "\n";
}

Prints:

Sat 01 Jan 2022 00:00:00 CET EST

So the %Z flag returns EST, as I expect, but %c returns CET as the time zone. That is probably because my system is set to CET, but is that the expected behavior?

My machine runs Ubuntu 21.04 with LC_TIME=nl_NL.UTF-8, and cat /etc/timezone returns Europe/Amsterdam, but I also noticed this happening on Arrow's CI.

rok, Sep 21 '21 15:09

Interesting. On my machine (macOS) I don't get the CET (or any other abbreviation besides the EST at the end). My best guess is that the combination of the en_GB.UTF-8 locale and the use of %c is producing the CET.

When parsing the format string, this library forwards %c to the stream using the std::put facility. This is because %c is the locale's date and time representation. To get better control, I recommend a flag that is not documented as the locale's anything, except maybe for the spelling of the month and weekday names. For example you could spell out exactly what you (might) want with "%a %d %b %Y %T %Z".

HowardHinnant, Sep 21 '21 16:09
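
The suggestion above, spelling out each field instead of relying on %c, can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not code from the date library: the helper name format_explicit is hypothetical, it uses std::put_time on a std::tm so that it compiles without date/tz.h, and the zone abbreviation is passed in by the caller rather than requested through %Z. With the date library itself, the format string "%a %d %b %Y %T %Z" would be passed to date::to_stream as in the original program.

```cpp
#include <ctime>
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: formats a broken-down time with explicit flags
// ("%a %d %b %Y %T") and appends a zone abbreviation chosen by the caller.
// Only the weekday and month names depend on the imbued locale.
std::string format_explicit(const std::tm& tm,
                            const std::string& abbrev,
                            const std::locale& loc = std::locale::classic())
{
    std::ostringstream os;
    os.imbue(loc);  // the locale supplies the spelling of %a and %b
    os << std::put_time(&tm, "%a %d %b %Y %T") << ' ' << abbrev;
    return os.str();
}
```

Called with a std::tm describing 2022-01-01 00:00:00 (tm_wday = 6) and the abbreviation "EST" in the classic locale, this yields Sat 01 Jan 2022 00:00:00 EST. The field order and the 24-hour clock stay fixed regardless of the locale, which is the extra control the comment describes.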

I've tried a couple of other locales (nl_NL.UTF-8, en_US.UTF-8) and I keep getting CET instead of EST. I'll do some more testing via CI in case this is only a problem with my machine.

> To get better control, I recommend a flag that is not documented as the locale's anything, except maybe for the spelling of the month and weekday names. For example you could spell out exactly what you (might) want with "%a %d %b %Y %T %Z".

Thanks for the suggestion, I might just do that!

rok, Sep 21 '21 16:09

> Thanks for the suggestion, I might just do that!

Actually this wouldn't work - %c is locale dependent.

"en_US.utf-8", strftime(x, "%c", tz="EST") -> Sat 01 Jan 2022 12:00:00 AM EST
"en_GB.utf-8", strftime(x, "%c", tz="EST") -> Sat 01 Jan 2022 00:00:00 EST

If this is not isolated to my machine I'll probably disable the flag for now.

rok, Sep 21 '21 17:09

Thanks for the reply and input! :)

rok, Sep 21 '21 17:09

Everything documented as locale dependent is implemented by forwarding the request to std::put in order to get the effects of said locale. What std::put does with it is outside the control of this library. The locale independent stuff is implemented within date.h.
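
A minimal sketch of what such forwarding can look like, using only the standard library. It assumes that the facility referred to as std::put is the std::time_put facet of the stream's locale. The helper name put_percent_c is hypothetical, and this is an illustration of the technique rather than the code in date.h.

```cpp
#include <ctime>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: hands the "%c" request to the std::time_put facet
// of the given locale. The caller decides nothing about the layout; the
// facet (and the C library behind it) does.
std::string put_percent_c(const std::tm& tm, const std::locale& loc)
{
    std::ostringstream os;
    os.imbue(loc);
    const auto& facet = std::use_facet<std::time_put<char>>(os.getloc());
    const char fmt[] = {'%', 'c'};
    // put(out, stream, fill, tm, pattern_begin, pattern_end)
    facet.put(std::ostreambuf_iterator<char>(os), os, os.fill(), &tm,
              fmt, fmt + 2);
    return os.str();
}
```

With std::locale::classic() and a std::tm for 2022-01-01 00:00:00 (tm_wday = 6), the result is Sat Jan  1 00:00:00 2022, the "C" locale's date and time representation. With a named locale such as en_GB.UTF-8, the layout, and whether a zone abbreviation appears, is whatever the platform's locale data defines.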

The one exception to the above rule is that if you compile with the macro ONLY_C_LOCALE=1, then everything that is locale dependent uses the "C" locale, and is implemented within date.h, never calling std::put. This can be used to work around bugs in your vendor's implementation of std::put. But it isn't helpful if you need locales other than "C".

HowardHinnant, Sep 21 '21 17:09
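
To illustrate the idea of formatting entirely in the "C" locale without consulting any locale facet, here is a small sketch. It is not the code that date.h uses under ONLY_C_LOCALE=1. The helper name format_c_locale is hypothetical, and the layout follows the C standard's definition of %c for the "C" locale ("%a %b %e %H:%M:%S %Y").

```cpp
#include <cstdio>
#include <ctime>
#include <string>

// Hypothetical helper: produces the "C" locale form of %c from fixed
// English tables, so the system locale and time zone cannot leak in.
std::string format_c_locale(const std::tm& tm)
{
    static const char* const wday[] = {"Sun", "Mon", "Tue", "Wed",
                                       "Thu", "Fri", "Sat"};
    static const char* const mon[] = {"Jan", "Feb", "Mar", "Apr",
                                      "May", "Jun", "Jul", "Aug",
                                      "Sep", "Oct", "Nov", "Dec"};
    // Reject out-of-range indices instead of reading past the tables.
    if (tm.tm_wday < 0 || tm.tm_wday > 6 || tm.tm_mon < 0 || tm.tm_mon > 11)
        return std::string();

    char buf[64];
    // %2d pads the day of month with a space, matching %e.
    std::snprintf(buf, sizeof buf, "%s %s %2d %02d:%02d:%02d %d",
                  wday[tm.tm_wday], mon[tm.tm_mon], tm.tm_mday,
                  tm.tm_hour, tm.tm_min, tm.tm_sec, tm.tm_year + 1900);
    return std::string(buf);
}
```

For a std::tm describing 2022-01-01 00:00:00 (tm_wday = 6) this returns Sat Jan  1 00:00:00 2022. The trade-off is the one stated above: the output is predictable, but month and weekday names are never localized.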

Oh, got it. That is good to know. Unfortunately we'll probably want non-C locales. I'll see if I can do something about the std::put problem and report back if something practical comes up. Thanks again!

rok, Sep 21 '21 17:09