chrono icon indicating copy to clipboard operation
chrono copied to clipboard

Split `clock` feature into `clock-now` and `clock-tz`

Open mmastrac opened this issue 2 years ago • 7 comments
trafficstars

The clock feature currently pulls in timezone features which pulls in a bunch of crates to deal with timezones (iana-time-zone, android-tz). Some of these crates pull in system libraries that negatively affect startup time for Deno, which means we need to create an inconvenient shim for getting the current UTC time: https://github.com/denoland/deno/blob/main/cli/main.rs.

It would be great if the feature could be split into smaller sub-features: one for reading the current time, and one for reading tz data, which would allow us to use Utc::now() without requiring timezone data.

mmastrac avatar Oct 27 '23 19:10 mmastrac

Sounds reasonable, want to send a PR to the 0.4.x branch?

Can you link to issues with more details on the perf impact you've seen?

djc avatar Oct 27 '23 21:10 djc

@djc Totally, we can PR that. I'll also ping some folks on our team to pull out more concrete data.

We have been running experiments for the past year on startup costs of various portions of Deno and one bit that stood out was the cost of loading various frameworks on Mac -- CoreFoundation and Security in particular. The former is pulled in via iana-time-zone on Mac. We've worked around it for now by disabling the clock feature, but it requires an awkward workaround to get the current UTC time.

mmastrac avatar Oct 27 '23 22:10 mmastrac

I built a very simple example that roughly corresponds to the additional overhead we are seeing in Deno. While the absolute number is quite small, it's about 5-10% of the work we need to hit our startup time budgets:

[package]
name = "chrono-test"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
chrono = { git = "https://github.com/mmastrac/chrono", branch = "split_clock_feature", default-features = false, features = ["clock"] }
use chrono::Utc;

fn main() {
    println!("Hello, world! {}", Utc::now());
}

with-cf is built using features = clock while no-cf is built using features = now.

11:09 $ otool -L with-cf 
with-cf:
        /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1971.0.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1319.100.3)
11:12 $ otool -L no-cf 
no-cf:
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1319.100.3)

Benchmarking on my local machine shows a ~0.3 ms difference (approx) in overhead from CoreFoundation on a recent-model M2 MBP:

$ hyperfine -S none -w 10 -m 100 './no-cf' './with-cf'
Benchmark 1: ./no-cf
  Time (mean ± σ):       0.9 ms ±   0.1 ms    [User: 0.3 ms, System: 0.3 ms]
  Range (min … max):     0.8 ms …   1.5 ms    3092 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Benchmark 2: ./with-cf
  Time (mean ± σ):       1.2 ms ±   0.1 ms    [User: 0.6 ms, System: 0.4 ms]
  Range (min … max):     1.1 ms …   1.8 ms    1803 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Summary
  './no-cf' ran
    1.41 ± 0.14 times faster than './with-cf'

mmastrac avatar Oct 28 '23 17:10 mmastrac

Okay, and have you figured out what "the work" is that is impacting performance? I wonder if there is something here that we (in chrono or our dependencies) should be doing more lazily so you're not impacted at all.

djc avatar Oct 28 '23 20:10 djc

At least on Mac, I suspect we would need to upstream changes to the iana-time-zone crate to open the CoreFoundation framework on first use. It's likely we'll do that as well, though it's a more complex patch.

The extra time is unfortunately just what the OSX loader appears to require to load the framework. It doesn't require using a TZ -- just linking the framework is enough to pay that penalty.

I suspect (most) other platforms don't pull in link-time libraries for the current TZ so it's very likely a Mac-only issue.

mmastrac avatar Oct 29 '23 03:10 mmastrac

Hey, the time is spent by dyld loading every dynamic framework on macOS before app's main() is called. DYLD_PRINT_STATISTICS var can be used to measure time spent loading frameworks.

Ref https://medium.com/@erichoracek/how-we-cut-our-ios-apps-launch-time-in-half-with-this-one-cool-trick-7aca2011e2ea

littledivy avatar Oct 29 '23 03:10 littledivy

Can this issue be closed now that the clock feature into clock and now (#1343)?

pitdicker avatar Feb 02 '24 14:02 pitdicker