TimeZones.jl icon indicating copy to clipboard operation
TimeZones.jl copied to clipboard

Minimize allocations when unpacking `TimeZone`s from cache (updated)

Open tpgillam opened this issue 2 years ago • 6 comments

Closes https://github.com/JuliaTime/TimeZones.jl/issues/422

This is a rebase/reimplementation of #423 onto master . (FYI @nickrobinson251, I'm doing this after seeing your comment!)

I started by trying to rebase your PR properly, but in the end it was easier to cherry-pick your commits that still applied (just the tests), then re-implement the rest. I'm sorry that this results in this new PR, but wasn't sure how else to proceed.

One note -- I find that it now takes two allocations to retrieve a VariableTimeZone from the cache, rather than 1. I spent some time with the alllocation tracker trying to understand this with no luck, so I've updated the test expectation accordingly. This isn't any worse than the current behaviour on master, and in any case the main thing is that retrieving a FixedTimeZone is now allocation free :)

tpgillam avatar Nov 30 '23 17:11 tpgillam

Also, I've checked this against julia 1.10-rc1, and still see 2 allocs for constructing a VariableTimeZone unfortunately.

tpgillam avatar Nov 30 '23 18:11 tpgillam

Also, I've checked this against julia 1.10-rc1, and still see 2 allocs for constructing a VariableTimeZone unfortunately.

I tried an alternative refactor, where I had a separate cache for all the classes (taking us to a total of 3) -- see on this branch. Whilst this reduces the number of allocations (down to 1 again for the VariableTimeZone), overall the timings are significantly worse, presumably due to the additional hash-table lookup.

Microbenchmarks on my machine with Julia 1.10-rc1 for

@btime TimeZone("UTC")  # FTZ case
@btime TimeZone("America/Winnipeg")  # VTZ case

this PR

FTZ:  18.727 ns (0 allocations: 0 bytes)
VTZ:  37.680 ns (2 allocations: 96 bytes)

separate class cache

FTZ:  31.096 ns (0 allocations: 0 bytes)
VTZ:  46.151 ns (1 allocation: 48 bytes)

current master

FTZ:  48.142 ns (2 allocations: 64 bytes)
VTZ:  48.238 ns (2 allocations: 64 bytes)

tpgillam avatar Nov 30 '23 20:11 tpgillam

LGTM. I want to sit with this for a little bit

Thanks - let me know if there’s anything else from my end that would be helpful.

tpgillam avatar Nov 30 '23 21:11 tpgillam

PkgBenchmark ``` > TZDATA_VERSION=2016j julia --project=benchmark/ -e 'using PkgBenchmark, TimeZones; export_markdown(stdout, judge(TimeZones, "origin/HEAD", verbose=false))' PkgBenchmark: Running benchmarks... PkgBenchmark: using benchmark tuning data in /home/tom/.julia/dev/TimeZones/benchmark/tune.json Benchmarking 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| Time: 0:00:45 PkgBenchmark: Running benchmarks... PkgBenchmark: using benchmark tuning data in /home/tom/.julia/dev/TimeZones/benchmark/tune.json Benchmarking 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| Time: 0:00:47 ```

Benchmark Report for /home/tom/.julia/dev/TimeZones

Job Properties

  • Time of benchmarks:
    • Target: 30 Nov 2023 - 21:39
    • Baseline: 30 Nov 2023 - 21:40
  • Package commits:
    • Target: 2b0b76
    • Baseline: f9d0dc
  • Julia commits:
    • Target: 8e5136
    • Baseline: 8e5136
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: None
    • Baseline: None

Results

A ratio greater than 1.0 denotes a possible regression (marked with :x:), while a ratio less than 1.0 denotes a possible improvement (marked with :white_check_mark:). Only significant results - results that indicate possible regressions or improvements - are shown below (thus, an empty table means that all benchmark results remained invariant between builds).

ID time ratio memory ratio
["ZonedDateTime", "local", "ambiguous"] 1.08 (5%) :x: 1.00 (1%)
["ZonedDateTime", "local", "standard"] 0.92 (5%) :white_check_mark: 1.00 (1%)
["ZonedDateTime", "range", "VariableTimeZone/DatePeriod"] 1.09 (5%) :x: 1.00 (1%)
["interpret", "local", "ambiguous"] 0.92 (5%) :white_check_mark: 1.00 (1%)
["interpret", "local", "non-existent"] 0.91 (5%) :white_check_mark: 1.00 (1%)
["interpret", "local", "standard"] 0.94 (5%) :white_check_mark: 1.00 (1%)
["interpret", "utc"] 0.90 (5%) :white_check_mark: 1.00 (1%)
["parse", "ISOZonedDateTimeFormat"] 1.00 (5%) 0.94 (1%) :white_check_mark:
["parse", "issue#25"] 1.01 (5%) 1.05 (1%) :x:
["transition_range", "utc"] 1.88 (5%) :x: 1.00 (1%)
["tryparsenext_tz", "America/Argentina/ComodRivadavia"] 0.95 (5%) :white_check_mark: 1.00 (1%)
["tryparsenext_tz", "UTC"] 0.94 (5%) :white_check_mark: 1.00 (1%)
["tz_data", "parse_components"] 0.98 (5%) 1.07 (1%) :x:

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["ZonedDateTime"]
  • ["ZonedDateTime", "local"]
  • ["ZonedDateTime", "range"]
  • ["arithmetic"]
  • ["arithmetic", "broadcast"]
  • ["interpret", "local"]
  • ["interpret"]
  • ["parse"]
  • ["transition_range", "local"]
  • ["transition_range"]
  • ["tryparsenext_fixedtz"]
  • ["tryparsenext_tz"]
  • ["tz_data"]

Julia versioninfo

Target

Julia Version 1.9.4
Commit 8e5136fa297 (2023-11-14 08:46 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      "Fedora release 39 (Thirty Nine)"
  uname: Linux 6.6.2-201.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Nov 22 21:31:42 UTC 2023 x86_64 unknown
  CPU: Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz:
                 speed         user         nice          sys         idle          irq
       #1-12  4298 MHz     290496 s       1686 s      90879 s     443666 s      20652 s
  Memory: 31.26819610595703 GB (19133.734375 MB free)
  Uptime: 46609.8 sec
  Load Avg:  1.26  0.93  0.72
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, skylake)
  Threads: 1 on 12 virtual cores

Baseline

Julia Version 1.9.4
Commit 8e5136fa297 (2023-11-14 08:46 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
      "Fedora release 39 (Thirty Nine)"
  uname: Linux 6.6.2-201.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Nov 22 21:31:42 UTC 2023 x86_64 unknown
  CPU: Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz:
                 speed         user         nice          sys         idle          irq
       #1-12  4497 MHz     291253 s       1686 s      90977 s     450010 s      20669 s
  Memory: 31.26819610595703 GB (19001.28125 MB free)
  Uptime: 46670.1 sec
  Load Avg:  1.58  1.06  0.77
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, skylake)
  Threads: 1 on 12 virtual cores

For good measure I did a second run, and stuff moved around quite a bit. So I'm not reading too much into any of these numbers:

ID time ratio memory ratio
["ZonedDateTime", "local", "ambiguous"] 0.91 (5%) :white_check_mark: 1.00 (1%)
["interpret", "local", "ambiguous"] 1.29 (5%) :x: 1.00 (1%)
["interpret", "local", "non-existent"] 1.37 (5%) :x: 1.00 (1%)
["interpret", "local", "standard"] 1.21 (5%) :x: 1.00 (1%)
["parse", "ISOZonedDateTimeFormat"] 1.01 (5%) 0.94 (1%) :white_check_mark:
["parse", "issue#25"] 0.97 (5%) 1.05 (1%) :x:
["transition_range", "utc"] 0.75 (5%) :white_check_mark: 1.00 (1%)
["tryparsenext_fixedtz", "UTC"] 1.09 (5%) :x: 1.00 (1%)
["tz_data", "parse_components"] 0.99 (5%) 1.07 (1%) :x:

tpgillam avatar Nov 30 '23 21:11 tpgillam

Thanks both for the discussions so far! @omus are you happy to move forward with this, or are there are other thoughts/concerns we should discuss?

tpgillam avatar Dec 09 '23 12:12 tpgillam

Bump! @omus please could you merge this and make a new release?

nickrobinson251 avatar Mar 25 '24 18:03 nickrobinson251

Baseline timings for my machine. I'll need to resolve the conflict yet.

Julia 1.10.0

Current master (ac06a91)

julia> @btime TimeZone("America/Winnipeg");
  41.246 ns (2 allocations: 64 bytes)

julia> @btime TimeZone("UTC");
  40.868 ns (2 allocations: 64 bytes)

This PR (2b0b76e)

julia> @btime TimeZone("UTC");
  17.994 ns (0 allocations: 0 bytes)

julia> @btime TimeZone("America/Winnipeg");
  28.475 ns (2 allocations: 96 bytes)

omus avatar May 23 '24 03:05 omus

Codecov Report

Attention: Patch coverage is 97.22222% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 92.73%. Comparing base (2bc8f50) to head (1237173). Report is 7 commits behind head on master.

Files Patch % Lines
src/types/timezone.jl 97.22% 1 Missing :warning:

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #451      +/-   ##
==========================================
- Coverage   92.79%   92.73%   -0.07%     
==========================================
  Files          39       39              
  Lines        1818     1844      +26     
==========================================
+ Hits         1687     1710      +23     
- Misses        131      134       +3     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar May 23 '24 03:05 codecov-commenter

This PR 511d2cc

julia> @time_imports using TimeZones
      0.4 ms  Scratch
     10.2 ms  Preferences
      0.4 ms  PrecompileTools
               ┌ 0.0 ms Parsers.__init__()
     45.3 ms  Parsers 40.31% compilation time
      4.2 ms  InlineStrings
      0.4 ms  TZJData
      0.5 ms  Compat
      0.2 ms  Compat → CompatLinearAlgebraExt
      0.4 ms  ExprTools
      0.6 ms  Mocking
               ┌ 1.1 ms TimeZones.TZData.__init__()
               ├ 0.0 ms TimeZones.__init__()
     22.6 ms  TimeZones

julia> using BenchmarkTools

julia> @btime TimeZone("UTC");
  18.203 ns (0 allocations: 0 bytes)

julia> @btime TimeZone("America/Winnipeg");
  29.104 ns (2 allocations: 96 bytes)

julia> @btime istimezone("Europe/Warsaw");
  76.418 ns (1 allocation: 48 bytes)

I slight increase in allocated bytes with istimezone from this: https://github.com/JuliaTime/TimeZones.jl/pull/457#issuecomment-2118428992

omus avatar May 23 '24 03:05 omus

The cache code is definitely becoming unwieldy. I've created a TimeZoneCache struct which doesn't affect the performance:

This PR 67f2c3f

julia> @time_imports using TimeZones
      0.4 ms  Scratch
      9.1 ms  Preferences
      0.4 ms  PrecompileTools
               ┌ 0.0 ms Parsers.__init__()
     46.7 ms  Parsers 38.23% compilation time
      4.1 ms  InlineStrings
      0.3 ms  TZJData
      0.4 ms  Compat
      0.2 ms  Compat → CompatLinearAlgebraExt
      0.4 ms  ExprTools
      0.6 ms  Mocking
               ┌ 1.0 ms TimeZones.TZData.__init__()
               ├ 0.0 ms TimeZones.__init__()
     22.9 ms  TimeZones

julia> using BenchmarkTools

julia> @btime TimeZone("UTC");
  18.597 ns (0 allocations: 0 bytes)

julia> @btime TimeZone("America/Winnipeg");
  28.894 ns (2 allocations: 96 bytes)

julia> @btime istimezone("Europe/Warsaw");
  74.802 ns (1 allocation: 48 bytes)

omus avatar May 23 '24 04:05 omus

These changes are also a step in the right direction for reducing the initial cache loading time

This PR 67f2c3f

julia> @btime TimeZones._reload_tz_cache(TimeZones._COMPILED_DIR[]);
  20.795 ms (313574 allocations: 12.52 MiB)

Current master ac06a91

julia> @btime TimeZones._reload_tz_cache(TimeZones._COMPILED_DIR[]);
  22.808 ms (313621 allocations: 12.52 MiB)

omus avatar May 23 '24 13:05 omus

I'm planning on rolling some other changes in before making a release here. I expect TimeZones.jl 1.17 should be out by 2024-05-27 (Monday).

omus avatar May 23 '24 21:05 omus