Omega.jl

Why is this slow?

Open zenna opened this issue 7 years ago • 11 comments

julia> n = normal(0,1)

julia> @time mean(n, 10000000)
  7.962374 seconds (281.66 M allocations: 9.711 GiB, 9.67% gc time)
-0.00016734630189156838

julia> @time mean(randn(10000000))
  0.091020 seconds (7 allocations: 76.294 MiB, 3.16% gc time)
0.00016417361940622018

zenna avatar Apr 10 '18 00:04 zenna

julia> @time mean([rand(Distributions.Normal(0,1)) for i = 1:10000000])
  0.204776 seconds (8.96 k allocations: 76.766 MiB, 32.13% gc time)

zenna avatar Apr 10 '18 22:04 zenna

julia> @benchmark randn()
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     4.542 ns (0.00% GC)
  median time:      5.307 ns (0.00% GC)
  mean time:        5.585 ns (0.00% GC)
  maximum time:     29.316 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

julia> using Mu

julia> @benchmark rand(Mu.normal(0.0, 1.0))
BenchmarkTools.Trial: 
  memory estimate:  3.89 KiB
  allocs estimate:  34
  --------------
  minimum time:     15.589 μs (0.00% GC)
  median time:      16.151 μs (0.00% GC)
  mean time:        17.636 μs (0.80% GC)
  maximum time:     1.464 ms (96.68% GC)
  --------------
  samples:          10000
  evals/sample:     1

julia> using Distributions

julia> @benchmark rand(Normal(0.1))
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     7.919 ns (0.00% GC)
  median time:      8.824 ns (0.00% GC)
  mean time:        8.972 ns (0.00% GC)
  maximum time:     44.803 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     999

zenna avatar Apr 12 '18 21:04 zenna

Roughly 3,000x overhead relative to randn() (median 16.151 μs vs. 5.307 ns)
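
For reference, the overhead factor can be recomputed from the median times in the three benchmarks above. This is only a quick sketch; the variable names are made up for illustration and the numbers are copied from the BenchmarkTools output in the previous comment.

```julia
# Median times copied from the BenchmarkTools output above, in nanoseconds.
median_randn_ns = 5.307       # randn()
median_mu_ns    = 16.151e3    # rand(Mu.normal(0.0, 1.0)), reported as 16.151 μs
median_dist_ns  = 8.824       # rand(Normal(0.1)) from Distributions

# Overhead of the Mu sampler relative to each baseline.
overhead_vs_randn = median_mu_ns / median_randn_ns
overhead_vs_dist  = median_mu_ns / median_dist_ns

println("vs randn():       ", round(Int, overhead_vs_randn), "x")
println("vs Distributions: ", round(Int, overhead_vs_dist), "x")
```

Either way the slowdown is on the order of thousands, so the exact baseline matters less than the 34 allocations per sample.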

zenna avatar Apr 12 '18 21:04 zenna

This is weird. My times for Mu.normal are about 1/10th of yours, but everything else is 2x slower.

julia> @benchmark rand(Mu.normal(0.0, 1.0))
BenchmarkTools.Trial: 
  memory estimate:  3.88 KiB
  allocs estimate:  33
  --------------
  minimum time:     1.242 μs (0.00% GC)
  median time:      1.521 μs (0.00% GC)
  mean time:        2.096 μs (21.77% GC)
  maximum time:     355.423 μs (96.01% GC)
  --------------
  samples:          10000
  evals/sample:     10

jburroni avatar Apr 13 '18 02:04 jburroni

julia> @benchmark Mu.rand(normal(0.0, 1.0))
BenchmarkTools.Trial:
 memory estimate:  3.88 KiB
 allocs estimate:  33
 --------------
 minimum time:     671.394 ns (0.00% GC)
 median time:      735.423 ns (0.00% GC)
 mean time:        988.936 ns (20.00% GC)
 maximum time:     11.926 μs (89.64% GC)
 --------------
 samples:          10000

zenna avatar Apr 13 '18 02:04 zenna

julia> @benchmark rand(x, Mu.DiffOmega) setup=(x=normal(0.0, 1.0))
BenchmarkTools.Trial: 
  memory estimate:  848 bytes
  allocs estimate:  11
  --------------
  minimum time:     450.227 ns (0.00% GC)
  median time:      459.798 ns (0.00% GC)
  mean time:        535.215 ns (10.85% GC)
  maximum time:     9.348 μs (93.61% GC)
  --------------
  samples:          10000
  evals/sample:     198

julia> @benchmark rand(x) setup=(x=Distributions.Normal(0, 1))
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     5.880 ns (0.00% GC)
  median time:      6.593 ns (0.00% GC)
  mean time:        6.698 ns (0.00% GC)
  maximum time:     27.988 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

julia> quantilerand(x) = quantile(x, rand())
quantilerand (generic function with 1 method)

julia> @benchmark quantilerand(x) setup=(x=Distributions.Normal(0, 1))
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     18.331 ns (0.00% GC)
  median time:      19.775 ns (0.00% GC)
  mean time:        20.052 ns (0.00% GC)
  maximum time:     70.254 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     998

zenna avatar Apr 14 '18 19:04 zenna

I think this is the best we'll get. Most of the remaining overhead is dictionary creation.

julia> @benchmark rand(x, Mu.DiffOmega) setup=(x=normal(0.0, 1.0))
BenchmarkTools.Trial: 
  memory estimate:  816 bytes
  allocs estimate:  9
  --------------
  minimum time:     173.438 ns (0.00% GC)
  median time:      182.110 ns (0.00% GC)
  mean time:        271.396 ns (29.90% GC)
  maximum time:     4.111 μs (92.86% GC)
  --------------
  samples:          10000
  evals/sample:     737
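
To get a feel for how much a per-sample dictionary costs on its own, here is a minimal sketch using only Base. It assumes, hypothetically, that each draw builds a fresh Dict keyed by a Vector{Int} id to hold the underlying random value; `sample_with_dict` and `sample_plain` are illustrative names, not functions from Mu.

```julia
# Hypothetical stand-in for a sampler that keeps its random values in a Dict.
function sample_with_dict()
    store = Dict{Vector{Int}, Float64}()   # new dictionary on every sample
    id = [1]                                # new id vector on every sample
    get!(() -> randn(), store, id)          # look up the id, or draw and store a value
end

# Baseline: draw directly with no bookkeeping.
sample_plain() = randn()

# Warm up so that compilation is not counted as allocation.
sample_with_dict(); sample_plain()

dict_bytes  = @allocated sample_with_dict()
plain_bytes = @allocated sample_plain()
println("bytes per sample with Dict: ", dict_bytes)
println("bytes per sample without:   ", plain_bytes)
```

The exact byte counts depend on the Julia version, so none are quoted here; the point is only that the Dict version allocates on every call while the baseline does not need to.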

zenna avatar Apr 16 '18 17:04 zenna

julia> @benchmark rand(x) setup=(x=normal(0.0, 1.0))
BenchmarkTools.Trial: 
  memory estimate:  656 bytes
  allocs estimate:  6
  --------------
  minimum time:     124.614 ns (0.00% GC)
  median time:      131.486 ns (0.00% GC)
  mean time:        200.081 ns (30.25% GC)
  maximum time:     3.154 μs (94.68% GC)
  --------------
  samples:          10000
  evals/sample:     898

zenna avatar Apr 18 '18 23:04 zenna

Julia 0.7 is a little faster.

With simple omega:

julia> @benchmark rand(x) setup=(x=normal(0.0, 1.0))
BenchmarkTools.Trial: 
  memory estimate:  848 bytes
  allocs estimate:  6
  --------------
  minimum time:     119.229 ns (0.00% GC)
  median time:      130.683 ns (0.00% GC)
  mean time:        197.373 ns (28.16% GC)
  maximum time:     56.461 μs (99.62% GC)
  --------------
  samples:          10000
  evals/sample:     927

With DiffOmega:

julia> @benchmark rand(x) setup=(x=normal(0.0, 1.0))
BenchmarkTools.Trial: 
  memory estimate:  704 bytes
  allocs estimate:  7
  --------------
  minimum time:     93.745 ns (0.00% GC)
  median time:      99.638 ns (0.00% GC)
  mean time:        157.320 ns (34.25% GC)
  maximum time:     52.377 μs (99.77% GC)
  --------------
  samples:          10000
  evals/sample:     957

zenna avatar Jul 22 '18 12:07 zenna

Big overhead increase below; need to profile and fix.

julia> @benchmark rand(x) setup=(x=normal(0.0, 1.0))
BenchmarkTools.Trial: 
  memory estimate:  1.22 KiB
  allocs estimate:  16
  --------------
  minimum time:     752.370 ns (0.00% GC)
  median time:      899.571 ns (0.00% GC)
  mean time:        1.156 μs (21.26% GC)
  maximum time:     570.116 μs (99.78% GC)
  --------------
  samples:          10000
  evals/sample:     127

zenna avatar Aug 17 '18 13:08 zenna

With LinearΩ (big regression)

julia> @benchmark rand(x) setup=(x=normal(0.0, 1.0))
BenchmarkTools.Trial: 
  memory estimate:  2.56 KiB
  allocs estimate:  45
  --------------
  minimum time:     2.945 μs (0.00% GC)
  median time:      3.085 μs (0.00% GC)
  mean time:        3.500 μs (7.50% GC)
  maximum time:     313.384 μs (98.39% GC)
  --------------
  samples:          10000
  evals/sample:     8

Looking at the profile:

  • Surprising amount of time spent creating named tuples in trackerr (10%) and in callbacks (12%)
  • Tagging with soft err (10%), mostly due to the expense of merge
  • Hashing of Vector{Int} is slow
  • Creating the dictionary is expensive

Possible solutions:

  • Move to a linked list
  • Don't create a named tuple in applywoerr
  • Use a wrapper instead of a ref
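
A rough sketch of what the linked-list idea could look like, assuming ids are currently Vector{Int} values that get extended as sampling descends into nested random variables. The types `IdList`, `Nil`, `Cons` and the helpers `extend` and `tovector` are hypothetical and not part of Omega; whether this is actually faster would still need benchmarking.

```julia
# Hypothetical id type: an immutable singly linked list of Ints.
abstract type IdList end
struct Nil <: IdList end
struct Cons <: IdList
    head::Int
    tail::IdList
end

# Extending an id is O(1): the new node points at the existing list, nothing is copied.
extend(id::IdList, i::Int) = Cons(i, id)

# Convert to a Vector (outermost index first) for comparison with the Vector{Int} form.
function tovector(id::IdList)
    out = Int[]
    while id isa Cons
        pushfirst!(out, id.head)
        id = id.tail
    end
    out
end

root  = extend(Nil(), 1)
child = extend(root, 2)      # reuses `root` as its tail instead of copying it
println(tovector(child))     # prints [1, 2]
```

The design trade-off is that extension no longer copies or rehashes a whole vector, at the cost of an abstractly typed `tail` field, which may itself need attention if this ends up on the hot path.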

zenna avatar Feb 07 '19 16:02 zenna