Patch to handle missing /etc/localtime
On Arch Linux, the documentation clearly states that /etc/localtime is not required, and in CI it's not present. Patch the timezone handling to assume UTC when /etc/localtime is not there.
This is hopefully a sane fallback for systems where /etc/localtime may not be available.
--- a/c++/src/Timezone.cc 2025-03-12 12:09:01.267309399 +0000
+++ b/c++/src/Timezone.cc 2025-03-12 12:12:12.303295722 +0000
@@ -28,6 +28,12 @@
#include <map>
#include <sstream>
+#ifndef _MSC_VER
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>
+#endif
+
namespace orc {
// default location of the timezone files
@@ -694,6 +700,14 @@
#ifdef _MSC_VER
return getTimezoneByName("UTC");
#else
+ // The absence of LOCAL_TIMEZONE means UTC conventionally
+ {
+ struct stat _ignored;
+ if (stat(LOCAL_TIMEZONE, &_ignored) == -1) {
+ return getTimezoneByName("UTC");
+ }
+ }
+
return getTimezoneByFilename(LOCAL_TIMEZONE);
#endif
}
Thank you for reporting, @rickysarraf .
cc @wgtmac
I'm not sure if such fallback makes sense because it may slightly produce data shift for timestamp values written by a different host. I also see that the fallback exists on Windows.
@ffacs WDYT?
I'm not sure if such fallback makes sense because it may slightly produce data shift for timestamp values written by a different host. I also see that the fallback exists on Windows.
@ffacs WDYT?
I think the fallback makes sense. Because when using the local timezone, reading on different machines may produce different results.
The problem may come from the writer which blindly uses UTC to serialize timestamp values. In terms of fallback, I think it is better to make the fallback value configurable and fail if unset.
Let me close this because the claim is not true. The typical example is the old timezone ID, US/Pacific which is currently replaced into America/Los_Angeles in the new naming scheme. Currently, Java and some OSes supports this kind of legacy identifiers as a fallback, but new OSes have only new timezone IDs in these days.
- https://data.iana.org/time-zones/tzdb/backward
The absence of LOCAL_TIMEZONE means UTC conventionally
Let me close this because the claim is not true. The typical example is the old timezone ID,
US/Pacificwhich is currently replaced intoAmerica/Los_Angelesin the new naming scheme. Currently, Java and some OSes supports this kind of legacy identifiers as a fallback, but new OSes have only new timezone IDs in these days.* https://data.iana.org/time-zones/tzdb/backward
There is a difference between /etc/localtime being a symlink to a non-existent file (which is what you described) and the /etc/localtime file missing completely (which is the situation in Arch Linux containers). What if the proposed fallback was changed to use lstat instead of stat?
Even in that case, I don't think there is a common consensus that the missing file assumes UTC. Do you have any POSIX documentation link to support your claim?
There is a difference between /etc/localtime being a symlink to a non-existent file (which is what you described) and the /etc/localtime file missing completely (which is the situation in Arch Linux containers)
Do you have any POSIX documentation link that supports yours?
There is a localtime(5) man page (part of systemd) which says:
If /etc/localtime is missing, the default "UTC" timezone is used.
Oh, my bad. Thank you for pointing out that, @lahwaacz .
I reopened. Is it only for Arch linux? What about other OSes?
systemd is not an Arch Linux project 🤷
Thanks. I confirmed that it's universal.
Could you make a PR with this, @lahwaacz ? That seems to be the only remaining concerns on this discuss thread.