[ISSUE] Lots of 45mb Garbage Collections
Describe the bug 🐞
After enabling GC logging, there are lots of 45mb collections in v1.9 and similar but fewer collections in v1.10. Is there a hidden problem that causing this? Is this expected? See: https://github.com/SciML/DiffEqFlux.jl/actions/runs/6963558046/job/18949364949?pr=876#step:6:934
Expected behavior
If this many collections are needed, I expect fewer but bigger collections.
Additional context
I enabled verbose logging in this PR to show it in CI, but I don't mean to suggest merging this.
Allocations in what code? That needs an MWE.
It's happening with tests, and you can see it in CI logs. I will prepare a simple code to help to find the problem.
Just importing makes it happen:
julia> GC.enable_logging(true)
julia> using DiffEqFlux
GC: pause 57.38ms. collected 41.796171MB. incr
GC: pause 9.68ms. collected 45.262600MB. incr
GC: pause 16.19ms. collected 43.328717MB. incr
GC: pause 12.58ms. collected 45.140373MB. incr
GC: pause 10.19ms. collected 45.758649MB. incr
GC: pause 10.85ms. collected 45.298368MB. incr
GC: pause 9.94ms. collected 45.552704MB. incr
GC: pause 12.44ms. collected 45.222881MB. incr
GC: pause 31.45ms. collected 84.019482MB. incr
GC: pause 26.65ms. collected 44.294026MB. incr
GC: pause 26.80ms. collected 44.012313MB. incr
GC: pause 12.45ms. collected 45.630832MB. incr
GC: pause 10.44ms. collected 45.807120MB. incr
GC: pause 31.44ms. collected 39.978864MB. incr
GC: pause 23.42ms. collected 43.454560MB. incr
GC: pause 63.76ms. collected 34.884944MB. incr
GC: pause 35.76ms. collected 42.568185MB. incr
GC: pause 51.19ms. collected 40.909058MB. incr
GC: pause 42.47ms. collected 44.478098MB. incr
GC: pause 26.43ms. collected 45.805456MB. incr
GC: pause 12.93ms. collected 45.840640MB. incr
GC: pause 5.05ms. collected 45.872752MB. incr
GC: pause 9.41ms. collected 45.870880MB. incr
GC: pause 5.78ms. collected 45.791472MB. incr
GC: pause 4.74ms. collected 45.844864MB. incr
GC: pause 5.22ms. collected 45.872752MB. incr
GC: pause 6.30ms. collected 45.871264MB. incr
GC: pause 5.16ms. collected 45.407056MB. incr
GC: pause 4.64ms. collected 45.857808MB. incr
GC: pause 7.93ms. collected 45.858208MB. incr
GC: pause 5.90ms. collected 45.855696MB. incr
GC: pause 3.93ms. collected 45.859072MB. incr
GC: pause 4.60ms. collected 45.859296MB. incr
GC: pause 4.16ms. collected 45.856080MB. incr
GC: pause 4.24ms. collected 45.860592MB. incr
GC: pause 4.16ms. collected 45.858048MB. incr
GC: pause 4.22ms. collected 45.858112MB. incr
GC: pause 4.02ms. collected 45.414160MB. incr
GC: pause 21.28ms. collected 43.553536MB. incr
GC: pause 32.83ms. collected 43.595696MB. incr
GC: pause 36.25ms. collected 43.491072MB. incr
GC: pause 30.38ms. collected 42.905615MB. incr
GC: pause 21.30ms. collected 43.745812MB. incr
GC: pause 25.11ms. collected 43.699360MB. incr
GC: pause 18.86ms. collected 43.981161MB. incr
GC: pause 27.35ms. collected 43.179738MB. incr
GC: pause 20.64ms. collected 42.845784MB. incr
GC: pause 22.45ms. collected 39.761012MB. incr
GC: pause 23.30ms. collected 38.265232MB. incr
GC: pause 32.65ms. collected 38.866064MB. incr
GC: pause 17.16ms. collected 50.461456MB. incr
GC: pause 19.82ms. collected 39.387896MB. incr
GC: pause 30.84ms. collected 42.279936MB. incr
GC: pause 20.69ms. collected 41.879768MB. incr
GC: pause 21.25ms. collected 40.108576MB. incr
GC: pause 28.18ms. collected 46.541039MB. incr
GC: pause 24.81ms. collected 39.836282MB. incr
GC: pause 26.55ms. collected 42.104621MB. incr
GC: pause 25.34ms. collected 41.990615MB. incr
GC: pause 30.42ms. collected 45.752040MB. incr
GC: pause 28.23ms. collected 82.522075MB. incr
GC: pause 30.69ms. collected 44.333632MB. incr
GC: pause 38.66ms. collected 43.105015MB. incr
GC: pause 36.82ms. collected 41.702379MB. incr
GC: pause 35.28ms. collected 37.977064MB. incr
GC: pause 29.94ms. collected 41.555611MB. incr
GC: pause 36.09ms. collected 42.575280MB. incr
GC: pause 24.93ms. collected 40.031064MB. incr
GC: pause 31.66ms. collected 83.571356MB. incr
GC: pause 34.54ms. collected 45.676024MB. incr
GC: pause 24.10ms. collected 43.524004MB. incr
GC: pause 22.01ms. collected 43.989228MB. incr
GC: pause 28.51ms. collected 83.959226MB. incr
GC: pause 26.63ms. collected 48.136237MB. incr
GC: pause 36.85ms. collected 43.874495MB. incr
GC: pause 21.87ms. collected 45.847024MB. incr
GC: pause 18.47ms. collected 45.884080MB. incr
After that:
julia> using DiffEqFlux, Zygote, Lux, OrdinaryDiffEq
julia> nn = Dense(10 => 10, tanh)
GC: pause 25.03ms. collected 43.252791MB. incr
Dense(10 => 10, tanh_fast) # 110 parameters
julia> r = rand(Float32, 10, 3)
10×3 Matrix{Float32}:
0.27089 0.646868 0.883482
0.00319999 0.67531 0.13434
0.000828087 0.193554 0.0964574
0.630383 0.611936 0.527316
0.565446 0.397655 0.950981
0.715725 0.971159 0.93148
0.872359 0.550618 0.325878
0.908961 0.605464 0.479908
0.608819 0.762887 0.66519
0.726801 0.6027 0.812821
julia> ffjord = FFJORD(nn, (0f0, 1f0), 10, Tsit5())
FFJORD(
model = Dense(10 => 10, tanh_fast), # 110 parameters
) # Total: 110 parameters,
# plus 0 states.
julia> using Random
julia> ps, st = Lux.setup(Random.default_rng(), ffjord)
GC: pause 21.45ms. collected 43.931268MB. incr
GC: pause 20.41ms. collected 45.875296MB. incr
GC: pause 16.84ms. collected 45.875920MB. incr
GC: pause 7.58ms. collected 45.874784MB. incr
GC: pause 7.85ms. collected 45.875600MB. incr
GC: pause 7.85ms. collected 45.874240MB. incr
GC: pause 7.88ms. collected 45.875440MB. incr
GC: pause 7.60ms. collected 45.875056MB. incr
GC: pause 7.69ms. collected 45.875024MB. incr
GC: pause 7.65ms. collected 45.876240MB. incr
GC: pause 7.57ms. collected 45.874880MB. incr
((weight = Float32[-0.52843523 -0.5061275 … -0.052238554 -0.4398058; 0.06725158 0.20361221 … -0.37902012 0.48752564; … ; -0.42431933 0.15210965 … -0.5166999 -0.12965634; -0.29427272 -0.031881046 … -0.36461896 -0.14274831], bias = Float32[0.0; 0.0; … ; 0.0; 0.0;;]), (model = NamedTuple(), regularize = false, monte_carlo = true))
julia> ffjord(r, ps, st)
GC: pause 23.74ms. collected 86.637212MB. incr
GC: pause 27.28ms. collected 42.793098MB. incr
GC: pause 26.75ms. collected 43.053016MB. incr
GC: pause 24.33ms. collected 43.600941MB. incr
GC: pause 23.76ms. collected 43.163138MB. incr
GC: pause 23.59ms. collected 44.002933MB. incr
GC: pause 21.49ms. collected 46.052469MB. incr
GC: pause 23.94ms. collected 45.862518MB. incr
GC: pause 22.96ms. collected 45.687431MB. incr
GC: pause 23.96ms. collected 45.411270MB. incr
GC: pause 17.55ms. collected 45.775945MB. incr
GC: pause 20.82ms. collected 43.331690MB. incr
GC: pause 32.66ms. collected 40.777536MB. incr
GC: pause 24.50ms. collected 44.020420MB. incr
GC: pause 22.44ms. collected 45.721750MB. incr
GC: pause 16.58ms. collected 45.875952MB. incr
GC: pause 16.59ms. collected 45.875008MB. incr
GC: pause 7.32ms. collected 45.875264MB. incr
GC: pause 7.70ms. collected 45.874736MB. incr
GC: pause 7.01ms. collected 45.875920MB. incr
GC: pause 10.08ms. collected 45.875168MB. incr
GC: pause 10.09ms. collected 45.875200MB. incr
GC: pause 8.92ms. collected 45.874720MB. incr
GC: pause 7.44ms. collected 45.873664MB. incr
GC: pause 25.28ms. collected 44.724036MB. incr
GC: pause 21.27ms. collected 44.455112MB. incr
GC: pause 25.41ms. collected 43.457275MB. incr
GC: pause 26.37ms. collected 44.266443MB. incr
GC: pause 20.24ms. collected 45.876784MB. incr
GC: pause 15.62ms. collected 45.873552MB. incr
GC: pause 7.66ms. collected 45.877296MB. incr
GC: pause 7.68ms. collected 45.875168MB. incr
GC: pause 7.17ms. collected 45.874368MB. incr
GC: pause 7.91ms. collected 45.875392MB. incr
GC: pause 7.14ms. collected 45.875104MB. incr
GC: pause 10.07ms. collected 45.874784MB. incr
GC: pause 8.85ms. collected 45.876496MB. incr
GC: pause 7.94ms. collected 45.873456MB. incr
GC: pause 16.83ms. collected 45.553760MB. incr
GC: pause 13.78ms. collected 45.875056MB. incr
GC: pause 17.09ms. collected 45.874688MB. incr
GC: pause 8.63ms. collected 45.873760MB. incr
GC: pause 7.54ms. collected 45.877584MB. incr
GC: pause 7.23ms. collected 45.874896MB. incr
GC: pause 8.97ms. collected 45.873568MB. incr
GC: pause 7.04ms. collected 45.877296MB. incr
GC: pause 8.68ms. collected 45.875248MB. incr
GC: pause 9.18ms. collected 45.873776MB. incr
GC: pause 12.07ms. collected 45.746576MB. incr
GC: pause 18.63ms. collected 45.853792MB. incr
GC: pause 21.60ms. collected 43.227097MB. incr
GC: pause 82.36ms. collected 84.026282MB. incr
GC: pause 29.45ms. collected 45.823282MB. incr
GC: pause 22.44ms. collected 45.874880MB. incr
GC: pause 14.71ms. collected 45.875152MB. incr
GC: pause 7.42ms. collected 45.876064MB. incr
GC: pause 7.73ms. collected 45.874912MB. incr
GC: pause 7.43ms. collected 45.874928MB. incr
GC: pause 7.61ms. collected 45.875904MB. incr
GC: pause 9.61ms. collected 45.874784MB. incr
GC: pause 9.30ms. collected 45.875040MB. incr
GC: pause 7.96ms. collected 45.875264MB. incr
GC: pause 7.82ms. collected 45.875184MB. incr
GC: pause 16.85ms. collected 45.340669MB. incr
GC: pause 19.87ms. collected 44.938351MB. incr
GC: pause 23.68ms. collected 43.271315MB. incr
GC: pause 25.31ms. collected 44.622375MB. incr
GC: pause 22.15ms. collected 45.820919MB. incr
GC: pause 18.42ms. collected 45.875232MB. incr
GC: pause 12.79ms. collected 45.874544MB. incr
GC: pause 8.04ms. collected 45.875200MB. incr
GC: pause 7.72ms. collected 45.875248MB. incr
GC: pause 7.79ms. collected 45.875760MB. incr
GC: pause 7.70ms. collected 45.875376MB. incr
GC: pause 8.01ms. collected 45.875312MB. incr
GC: pause 7.87ms. collected 45.874208MB. incr
GC: pause 7.65ms. collected 45.876144MB. incr
GC: pause 13.28ms. collected 45.631024MB. incr
GC: pause 13.92ms. collected 45.873808MB. incr
GC: pause 13.02ms. collected 45.875776MB. incr
GC: pause 14.18ms. collected 45.833536MB. incr
GC: pause 19.38ms. collected 45.463792MB. incr
GC: pause 18.82ms. collected 45.875600MB. incr
GC: pause 17.95ms. collected 45.875600MB. incr
GC: pause 12.55ms. collected 45.875536MB. incr
GC: pause 9.65ms. collected 45.873888MB. incr
GC: pause 9.63ms. collected 45.875728MB. incr
GC: pause 9.85ms. collected 45.875920MB. incr
GC: pause 9.95ms. collected 45.874192MB. incr
GC: pause 9.89ms. collected 45.875968MB. incr
GC: pause 9.94ms. collected 45.874704MB. incr
GC: pause 9.92ms. collected 45.876064MB. incr
GC: pause 9.80ms. collected 45.875264MB. incr
GC: pause 9.93ms. collected 45.875616MB. incr
GC: pause 10.01ms. collected 45.873664MB. incr
GC: pause 10.00ms. collected 45.875936MB. incr
GC: pause 9.71ms. collected 45.876320MB. incr
GC: pause 10.73ms. collected 45.873584MB. incr
GC: pause 9.83ms. collected 45.878096MB. incr
(GC: pause 17.17ms. collected 45.693310MB. incr
(Float32[-13.99764 -12.953953 -13.197707], Float32[1.0408309 0.97217447 0.9595545], Float32[1.0408309 0.97217447 0.9595545]), (model = NamedTuple(), regularize = false, monte_carlo = true))
What does the allocation profiler say?
I can't reproduce it now.