chrono
chrono copied to clipboard
timezone caching prevents updating the timezone
In a proprietary project we're currently locked to chrono 0.4.19 because the recent addition of caching the timezone broke our code. I have to admit that the way we propagate the change to chrono is quite hacky and that we're relying on implementation details of both glibc and chrono.
So first of: yes, the TZ variable points to a symlink which systemd changes after updating the timezone. This is even the case on desktop arch linux.
So basically we're using dbus to listen for timezone changes that come from systemd-timesyncd.
Once that happened we need to make sure that localtime_r returns the updated value. The issue is that glibc caches the value, too. The difference is that they update it when the value of TZ has changed. So we can simply change the value, call tzset, change the value back, call tzset again.
That hack basically works for C and worked for chrono too because it always called localtime_r. Since that's not the case anymore, the change doesn't propagate to chrono anymore. So basically we need a way to propagate a timezone change to chrono.
Originally posted by @M1cha in https://github.com/chronotope/chrono/issues/728#issuecomment-1267035270
Replying to https://github.com/chronotope/chrono/pull/728#issuecomment-1268051610 : The issue is that if the value comes from the environment variable, it's never read again:
// we don't bother storing the contents of the environment variable in this case.
// changing the environment while the process is running is generally not reccomended
To propose a solution I'd first have to understand why we need a cache in first place. Is it really that slow to parse the timezone? If we want read and compare the TZ variable on every call, would that prevent the cache from speeding up anything?
So there's multiple things here:
- Reading the environment variable
- Reading data from disk
- Parsing a timezone definition
All of these combined I think warrant having a cache, but maybe if some of the parts aren't needed we wouldn't? Also if you want to create some benchmarks for this we might decide we don't need (as much) caching. I seem to remember maybe @esheppa did some benchmarking, but can't find it now...
I think there are a few competing issues here:
- It looks like @M1cha is using the
TZvariable rather than a/etc/localtimesymlink, so the existing caching may not apply for this - The reason why I'd introduced the quoted comment and not allowed for
TZupdating is that I'd thought (potentially incorrectly) that updating environment variables was problematic even whenchronoitself wasn't doing it - however from a re-read of the issues it looks like it is more around rust code calling libc code which internally callssetvar.
There are a couple of options here:
- If it is suitable for your use case, @M1cha you could instead use the
/etc/localtimesymlink and not set theTZvariable - Alternatively, we could introduce a check for environment variable changes, storing either the full variable or a hash of it in the
Sourceenum. Potentially this should also be behind an opt-in feature flag due to the potential issues around changing environment variables while the process is running
As another potential option available now, if you listen for the dbus messages in rust, you could use a combination of chrono, chrono-tz and iana-time-zone to get the name of the timezone, then have chono-tz parse that and use it in chrono.
| I seem to remember maybe @esheppa did some benchmarking, but can't find it now...
I don't thing I did this with #728, but I'd definitely like to do it in combination with any further changes to the caching strategy
- Alternatively, we could introduce a check for environment variable changes, storing either the full variable or a hash of it in the
Sourceenum. Potentially this should also be behind an opt-in feature flag due to the potential issues around changing environment variables while the process is running
I think this is pretty reasonable. What we've got is a decent first pass, but now that we know people do actually change the TZ variable in flight, I think we should support that use case (storing a hash sounds reasonable). I don't see why we would hide that behavior (which improves correctness at the cost of a small performance penalty) behind a use flag -- I don't think it is up to us to be all that normative in saying "you just shouldn't do that".
changing the TZ variable - while not portable - has the advantage that it would (in theory) work with both C and rust code and even multiple rust crates. It effectively updates the timezone for the whole binary. So yes, having chrono conform to that would make it easier achieve that goal.
Here's also some useful notes I extracted from our code
- glibc's localtime_r doesn't call
tzsetinternally to make it more thread-safe - glibc caches the value of
TZ- even if it's a path to a file like/etc/localtime. code - Usually,
TZis not set and a compile-time provided default-path(like/etc/localtime) is used. In that case glibc will always reload the file becauseTZstaysNULL. code
It seems that in my case I actually only have an issue during unit tests where TZ is explicitly set to a temporary path. Still, I think replicating glibcs behavior is useful for the reasons above.
After looking at #853 I had some more insights: If TZ points to a symlink(which is the case on most systems nowadays), "changing the timezone for the whole binary" is pretty limited and library-specific. As I described, my approach is to make the value of TZ change without any effect on the timezone. I simply resolve the symlink, force the library(glibc) to re-read the variable, change it back, force to re-read again.
This has a lot of flaws:
- "forced the library(glibc) to re-read the variable" is library specific and has to be done for every library you want to use. For (g)libc you can do it by calling
tzset. For the code in #853 you could (AFAICT) callLocal::now(). - This assumes that the TZ cache is shared for all threads. For chrono that is currently not the case which might be an issue.
- (The change-process not being atomic might be an issue but at first it looks like it shouldn't be)
I'm a little unsure about the conclusions to draw from your stated flaws. Do you think the solution #853 would solve your issue?
#853 would solve my specific usecase because my app is single threaded (tokio using the current_thread flavor). For other usecases, (or if I were to change that) it simply wouldn't be enough. You'd have to make the cache not use thread-local-storage.
The fact that "force re-reading TZ" is library-specific isn't an issue, it's simply inconvenient because it prevents me from creating a public crate that "simply works" no matter how you get your localtime.
Note that in the solution proposed in #853, I don't think there's a need to "force re-reading TZ": any operation involving the Local timezone will check the cache and start picking up the changed timezone.
If I understand the code correctly, that's not true for the case I described where TZ points to a symlink. timedatectl will change the symlink so updating TZ isn't necessary in first place.
@M1cha - my assumption here is that in the case where TZ points to a symlink, it will be /etc/localtime which we check first anyway? Are there alternative symlinks that it may point to?
However this style of API may suit better as you can then implement your own cache invalidation customized to your specific use case (listening to d-bus message, checking environment variable, checking symlink mtime, etc, etc): https://github.com/chronotope/chrono/pull/830#issuecomment-1287771401
RE. sharing on all threads - each thread has its own cache will will update after expiry (1 second). I think implementing an Arc<Mutex> style cache might be too heavyweight here
IMO the actual path is system-specific and this library should not rely on it other than using /etc/localtime as a fallback in case TZ is not set. Where in the code can I see the assumption that the path is /etc/localtime? I can't find it. I only see the default fallback but the existence of the variable TZ is in no way related to the possibility of whatever path being a symlink.
The Timezone API does indeed sound better to me since I wouldn't have to force-update anything and could instead simply let the code that is affected by the timezone change always read the timezone without any caching. My usecases are actually quite simple:
- I have a service that configures wireless devices (old ones, with a poorly designed protocol). It needs to tell them the current timezone. So when systemd tells me the timezone has changed I simply have to do a set-timezone-request for every device(in their respective tokio task). So if that task never uses a cache in first place I don't need to force-update anything and can just trigger those tasks.
- I have another service which has some kind of cron-job. every night at e.g. 1am local time, I have to do certain stuff. It has to be local time so it's always at night when the system is (probably) not in use. This is still being worked on so I haven't thought it through yet, but not having to deal with caching at all would probably also make this kinda trivial. I may not even have to listen for systemd-events.
Both of these usecases would be fine with simply never using caching because their codepaths don't run often so they can't suffer from bad performance.
#1457 should be able to fix this.