StaticArrays.jl

Info request: performance of == vs map(==)

Open alhirzel opened this issue 3 years ago • 9 comments

Wondering if the following performance is expected?

#               _
#   _       _ _(_)_     |  Documentation: https://docs.julialang.org
#  (_)     | (_) (_)    |
#   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
#  | | | | | | |/ _` |  |
#  | | |_| | | | (_| |  |  Version 1.6.1 (2021-04-23)
# _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
#|__/                   |
#
julia> using StaticArrays

julia> @time SVector(1:100...) == SVector(1:100...)
  0.347808 seconds (749.39 k allocations: 48.644 MiB, 15.53% gc time, 99.78% compilation time)
true

julia> @time map(==, [SVector(1:100...)], [SVector(1:100...)]);
 12.972637 seconds (409.54 k allocations: 24.336 MiB, 100.00% compilation time)

Assuming this is expected, would anyone be willing to shed light on what map is doing to cause this?

alhirzel, Dec 13 '21 19:12

Note:

100.00% compilation time
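One way to see the split between compilation and execution is to time the same call twice in one session. The sketch below is illustrative only: it uses a hypothetical 8-element SVector instead of the 100-element one so that it finishes quickly, and the absolute timings will differ by machine and Julia version.

```julia
using StaticArrays

# Small stand-in vectors; N = 8 is an arbitrary choice for this sketch.
a = SVector(ntuple(identity, 8))
b = SVector(ntuple(identity, 8))

# The first call pays for JIT compilation of map for these argument types.
t_first = @elapsed map(==, [a], [b])

# The second call reuses the compiled code, so it measures run time only.
t_second = @elapsed map(==, [a], [b])

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```

If the second call is dramatically faster than the first, the cost in the report is one-time compilation rather than the comparison itself.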

fredrikekre, Dec 13 '21 20:12

Yep, good eye! I don't understand what additional function needs to be compiled, though. Since map(==, [a], [b]) is (at least conceptually 😜) equal to [a == b], I would not expect this discrepancy.
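To make the "conceptually equal" claim concrete, here is a minimal sketch comparing the three spellings on small, hypothetical 4-element vectors. They agree in value, but they are not the same code from the compiler's point of view: each form is a separate caller that gets its own specialization, even though all of them end up calling ==(::SVector, ::SVector).

```julia
using StaticArrays

# Small stand-ins for the 100-element vectors in the report.
a = SVector(1, 2, 3, 4)
b = SVector(1, 2, 3, 4)

via_map           = map(==, [a], [b])                     # generic map over plain Vectors
via_comprehension = [x == y for (x, y) in zip([a], [b])]  # explicit iteration over the pairs
direct            = [a == b]                              # single comparison, wrapped in a Vector

println(via_map == via_comprehension == direct)
```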

alhirzel, Dec 13 '21 23:12

You can try SnoopCompile to see where the difference comes from:

using SnoopCompile
SnoopCompile.@snoopc "/tmp/compiles_a.log" begin
    using StaticArrays
    map(==, [SVector(1:100...)], [SVector(1:100...)]);
end
SnoopCompile.@snoopc "/tmp/compiles_b.log" begin
    using StaticArrays
    SVector(1:100...) == SVector(1:100...)
end

mateuszbaran, Dec 14 '21 07:12

Use @snoopi_deep rather than @snoopc.

timholy, Dec 14 '21 09:12

Tried @snoopi_deep for the map-variant:

julia> using StaticArrays, SnoopCompile

julia> v = [SVector(1:100...)]
1-element Vector{SVector{100, Int64}}:
 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10  …  91, 92, 93, 94, 95, 96, 97, 98, 99, 100]

julia> tinf = @snoopi_deep map(==, v, v);

julia> flatten(tinf)
344-element Vector{SnoopCompileCore.InferenceTiming}:
 InferenceTiming: 0.000020/0.000020 on convert(::Type{Int64}, 0::Int64)
 InferenceTiming: 0.000025/0.000025 on Base._counttuple(::Type{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}})
 InferenceTiming: 0.000026/0.000026 on Base.argtail(::Vector{SVector{100, Int64}}, ::Vector{SVector{100, Int64}})
 InferenceTiming: 0.000027/0.000027 on Base.isdone(::Vector{SVector{100, Int64}})
 InferenceTiming: 0.000027/0.000027 on ndims(::Vector{Base.HasShape{1}})
 InferenceTiming: 0.000028/0.000028 on convert(::Type{Int64}, 1::Int64)
 InferenceTiming: 0.000028/0.000028 on Base.argtail(((SOneTo(100), 1),)::Tuple{Tuple{SOneTo{100}, Int64}}, (((SOneTo(100), 1),),)::Tuple{Tuple{SOneTo{100}, Int64}})
 InferenceTiming: 0.000029/0.000029 on Base.isdone(::SVector{100, Int64}, ::Tuple{SOneTo{100}, Int64})
 InferenceTiming: 0.000029/0.000029 on getproperty(Core.Compiler::Module, return_type::Symbol)
 InferenceTiming: 0.000029/0.000029 on Base.Iterators.and_iteratorsize(Base.HasShape{1}()::Base.HasShape{1}, Base.HasShape{1}()::Base.HasShape{1})
 InferenceTiming: 0.000030/0.000030 on convert(::Type{Base.HasShape{1}}, Base.HasShape{1}()::Base.HasShape{1})
 InferenceTiming: 0.000030/0.000030 on +(2::Int64, 1::Int64)
 InferenceTiming: 0.000030/0.000030 on (::Base.Iterators.var"#5#6")(::Vector{SVector{100, Int64}})
 InferenceTiming: 0.000030/0.000030 on Base.argtail(2::Int64, (2,)::Int64)
 InferenceTiming: 0.000031/0.000031 on (::Base.Iterators.var"#5#6")(::SVector{100, Int64})
 InferenceTiming: 0.000031/0.000031 on convert(::Type{Base.var"#180#181"{Base.Iterators.var"#7#8"{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}}}}, #180::Base.var"#180#181"{Base.Iterators.var"#7#8"{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}}})
 InferenceTiming: 0.000031/0.000031 on getproperty(::UnitRange{Int64}, start::Symbol)
 InferenceTiming: 0.000031/0.000031 on Base.isdone(::Vector{SVector{100, Int64}}, ::Int64)
 InferenceTiming: 0.000033/0.000033 on getproperty(::Base.Generator{UnitRange{Int64}, Base.var"#180#181"{Base.Iterators.var"#7#8"{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}}}}, f::Symbol)
 InferenceTiming: 0.000033/0.000033 on getproperty(::Base.Generator{UnitRange{Int64}, Base.var"#180#181"{Base.Iterators.var"#7#8"{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}}}}, f::Symbol)
 InferenceTiming: 0.000033/0.000033 on Base.argtail(::Vector{SVector{100, Int64}})
 InferenceTiming: 0.000033/0.000033 on Base.argtail(::Tuple{Tuple{SOneTo{100}, Int64}})
 InferenceTiming: 0.000033/0.000033 on getproperty(::Base.Iterators.Zip{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}}, is::Symbol)
 ⋮
 InferenceTiming: 0.000611/0.000972 on Base.Iterators._zip_iterate_some(::Tuple{SVector{100, Int64}, SVector{100, Int64}}, ::Tuple{Tuple{Tuple{SOneTo{100}, Int64}}, Tuple{Tuple{SOneTo{100}, Int64}}}, (missing, missing)::Tuple{Missing, Missing}, missing::Missing)
 InferenceTiming: 0.000616/0.002379 on Base.collect_to_with_first!(::Vector{Base.HasShape{1}}, Base.HasShape{1}()::Base.HasShape{1}, ::Base.Generator{UnitRange{Int64}, Base.var"#180#181"{Base.Iterators.var"#7#8"{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}}}}, ::Int64)
 InferenceTiming: 0.000655/0.001437 on Base.Generator(::Base.var"#4#5", ::Base.Iterators.Zip{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}})
 InferenceTiming: 0.000672/0.002557 on Base.Iterators._zip_iterate_all(::Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}, ::Tuple{Tuple{Int64}, Tuple{Int64}})
 InferenceTiming: 0.000683/0.003044 on Base.Iterators._zip_iterate_all(::Tuple{SVector{100, Int64}, SVector{100, Int64}}, ::Tuple{Tuple{Tuple{SOneTo{100}, Int64}}, Tuple{Tuple{SOneTo{100}, Int64}}})
 InferenceTiming: 0.000710/0.000781 on (::Type{Base.Generator{Base.Iterators.Zip{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}}, _A}} where _A)(::Base.var"#4#5", ::Base.Iterators.Zip{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}})
 InferenceTiming: 0.000754/0.005946 on Base.Iterators._zip_iterate_all(::Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}, ((), ())::Tuple{Tuple{}, Tuple{}})
 InferenceTiming: 0.000876/0.012894 on Base._ntuple(#7::Base.Iterators.var"#7#8"{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}}, ::Int64)
 InferenceTiming: 0.000877/0.007580 on iterate(::Base.Generator{Base.Iterators.Zip{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}}})
 InferenceTiming: 0.000901/0.003896 on iterate(::Base.Generator{Base.Iterators.Zip{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}}}, ::Tuple{Int64, Int64})
 InferenceTiming: 0.000968/0.000968 on iterate(::Base.Generator{Base.Iterators.Zip{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}}, Base.var"#4#5"{typeof(==)}}, ::Tuple{Int64, Int64})
 InferenceTiming: 0.000984/0.004880 on Base.collect_to!(::Vector, ::Base.Generator{Base.Iterators.Zip{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}}}, ::Int64, ::Tuple{Int64, Int64})
 InferenceTiming: 0.001033/0.019158 on iterate(::Base.Generator{Base.Iterators.Zip{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}}, Base.var"#4#5"{typeof(==)}})
 InferenceTiming: 0.001042/0.003677 on collect(::Base.Generator{UnitRange{Int64}, Base.var"#180#181"{Base.Iterators.var"#7#8"{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}}}})
 InferenceTiming: 0.001164/0.001164 on Base.collect_to!(::Vector{Bool}, ::Base.Generator{Base.Iterators.Zip{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}}, Base.var"#4#5"{typeof(==)}}, ::Int64, ::Tuple{Int64, Int64})
 InferenceTiming: 0.001167/0.003237 on Base.Generator(::Function, ::Vector{SVector{100, Int64}}, ::Vector{SVector{100, Int64}})
 InferenceTiming: 0.001316/0.017500 on ==(::SVector{100, Int64}, ::SVector{100, Int64})
 InferenceTiming: 0.001600/0.007299 on collect(::Base.Generator{UnitRange{Int64}, Base.var"#180#181"{Base.Iterators.var"#7#8"{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}}}})
 InferenceTiming: 0.001611/0.033633 on collect(::Base.Generator{Base.Iterators.Zip{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}}})
 InferenceTiming: 0.001777/0.021903 on Base._iterator_upper_bound(::Base.Generator{Base.Iterators.Zip{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}}, Base.var"#4#5"{typeof(==)}})
 InferenceTiming: 0.001970/0.015039 on ntuple(#7::Base.Iterators.var"#7#8"{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}}, ::Int64)
 InferenceTiming: 0.002186/0.027022 on collect(::Base.Generator{Base.Iterators.Zip{Tuple{Vector{SVector{100, Int64}}, Vector{SVector{100, Int64}}}}, Base.var"#4#5"{typeof(==)}})
 InferenceTiming: 13.431963/13.496669 on Core.Compiler.Timings.ROOT()

Most of the time is not spent on inference: the exclusive time of ROOT (13.43 s out of 13.50 s in total) is about 99.5%. So it should be mostly code generation? I'm not really sure what could be done to fix this on the StaticArrays side; it is probably more of a Base thing. Also, this is only really bad when the SVector in question is very large.
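The 99.5% figure can be reproduced from the ROOT entry in the listing. The sketch assumes the usual SnoopCompile reading of `exclusive/inclusive` times, where ROOT's exclusive time is everything outside type inference and its inclusive time is the whole measured call.

```julia
# Exclusive and inclusive times (seconds) copied from the ROOT entry above.
root_exclusive = 13.431963
root_inclusive = 13.496669

# Whatever is not ROOT's own time was spent in type inference.
inference_time = root_inclusive - root_exclusive
root_share     = root_exclusive / root_inclusive

println("inference:         ", round(inference_time; digits = 3), " s")
println("outside inference: ", round(100 * root_share; digits = 1), " %")
```

So only about 0.065 s of the roughly 13.5 s goes to inference; the rest is code generation, LLVM optimization, and execution.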

thchr, Jan 10 '22 15:01

Agree that this may be an unrealistically large SVector. I encountered this in a code base that I have since re-factored, but it left me curious because it broke my understanding of how the compiled specialization of ==(::SVector, ::SVector) would be re-used.

Would anyone recommend that I move this issue to Base?

alhirzel, Jan 10 '22 15:01

The time is almost entirely LLVM. On the Julia side the most expensive thing can be precompiled as:

precompile(Tuple{typeof(Base.collect), Base.Generator{Base.Iterators.Zip{Tuple{Array{StaticArrays.SArray{Tuple{100}, Int64, 1, 100}, 1}, Array{StaticArrays.SArray{Tuple{100}, Int64, 1, 100}, 1}}}, Base.var"#4#5"{typeof(Base.:(==))}}})

and profiling that call gives something like this:

Overhead ╎ [+additional indent] Count File:Line; Function
=========================================================
    1╎1     /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZL17GroupByComplexityRN4llvm15SmallVectorImplIPKNS_4SCEVEEEPNS_8LoopInfoERNS_13DominatorTreeE
    2╎2     /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZL21CompareSCEVComplexityRN4llvm18EquivalenceClassesIPKNS_4SCEVEEERNS0_IPKNS_5ValueEEEPKNS_8LoopInfoES3_S3_RNS_13DominatorTreeEj
    1╎1     /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm16MetadataTracking7untrackEPvRNS_8MetadataE
    1╎1     /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZNK4llvm11Instruction15getMetadataImplEj
    1╎1     /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZNK4llvm19FoldingSetNodeIDRef11ComputeHashEv
    1╎1     /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZNSt8_Rb_treeIN4llvm18EquivalenceClassesIPKNS0_4SCEVEE7ECValueES6_St9_IdentityIS6_ESt4lessIS6_ESaIS6_EE8_M_eraseEPSt13_Rb_tree_nodeIS6_E
    2╎2     /lib/x86_64-linux-gnu/libc.so.6:?; __libc_malloc
     ╎10087 [unknown stackframe]
     ╎ 10083 [unknown stackframe]
     ╎  10083 [unknown stackframe]
    8╎   10083 [unknown stackframe]
     ╎    9948  julia:?; 
     ╎     9948  /lib/x86_64-linux-gnu/libc.so.6:?; __libc_start_main
     ╎    ╎ 9948  julia:?; main
     ╎    ╎  9948  /buildworker/worker/package_linux64/build/src/jlapi.c:701; jl_repl_entrypoint
     ╎    ╎   9948  /buildworker/worker/package_linux64/build/src/jlapi.c:559; true_main
     ╎    ╎    9948  /buildworker/worker/package_linux64/build/src/julia.h:1788; jl_apply
     ╎    ╎     9948  /buildworker/worker/package_linux64/build/src/gf.c:2429; jl_apply_generic
     ╎    ╎    ╎ 9948  /buildworker/worker/package_linux64/build/src/gf.c:2247; _jl_invoke
     ╎    ╎    ╎  9948  /home/mateusz/bin/julia-1.7.0/lib/julia/sys.so:?; jfptr__start_43127.clone_1
     ╎    ╎    ╎   9948  @Base/client.jl:495; _start()
     ╎    ╎    ╎    9948  @Base/client.jl:309; exec_options(opts::Base.JLOptions)
     ╎    ╎    ╎     9948  @Base/client.jl:379; run_main_repl(interactive::Bool, quiet::Bool, banner::Bool, history_file::Bool, color_set::Bool)
     ╎    ╎    ╎    ╎ 9948  @Base/essentials.jl:714; invokelatest
     ╎    ╎    ╎    ╎  9948  @Base/essentials.jl:716; #invokelatest#2
     ╎    ╎    ╎    ╎   9948  /buildworker/worker/package_linux64/build/src/builtins.c:757; jl_f__call_latest
     ╎    ╎    ╎    ╎    9948  /buildworker/worker/package_linux64/build/src/julia.h:1788; jl_apply
     ╎    ╎    ╎    ╎     9948  /buildworker/worker/package_linux64/build/src/gf.c:2429; jl_apply_generic
     ╎    ╎    ╎    ╎    ╎ 9948  /buildworker/worker/package_linux64/build/src/gf.c:2247; _jl_invoke
     ╎    ╎    ╎    ╎    ╎  9948  /home/mateusz/bin/julia-1.7.0/lib/julia/sys.so:?; jfptr_YY.930_32578.clone_1
     ╎    ╎    ╎    ╎    ╎   9948  @Base/client.jl:394; (::Base.var"#930#932"{Bool, Bool, Bool})(REPL::Module)
     ╎    ╎    ╎    ╎    ╎    9948  /buildworker/worker/package_linux64/build/src/gf.c:2429; jl_apply_generic
     ╎    ╎    ╎    ╎    ╎     9948  /buildworker/worker/package_linux64/build/src/gf.c:2247; _jl_invoke
     ╎    ╎    ╎    ╎    ╎    ╎ 9948  /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:349; run_repl(repl::REPL.AbstractREPL, consumer::Any)
     ╎    ╎    ╎    ╎    ╎    ╎  9948  /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:362; run_repl(repl::REPL.AbstractREPL, consumer::Any; backend_on_current_task::Bool)
     ╎    ╎    ╎    ╎    ╎    ╎   9948  /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:229; start_repl_backend(backend::REPL.REPLBackend, consumer::Any)
     ╎    ╎    ╎    ╎    ╎    ╎    9948  /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:244; repl_backend_loop(backend::REPL.REPLBackend)
     ╎    ╎    ╎    ╎    ╎    ╎     9948  /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:150; eval_user_input(ast::Any, backend::REPL.REPLBackend)
     ╎    ╎    ╎    ╎    ╎    ╎    ╎ 9948  @Base/boot.jl:373; eval
     ╎    ╎    ╎    ╎    ╎    ╎    ╎  9948  /buildworker/worker/package_linux64/build/src/toplevel.c:944; jl_toplevel_eval_in
     ╎    ╎    ╎    ╎    ╎    ╎    ╎   9948  /buildworker/worker/package_linux64/build/src/toplevel.c:830; jl_toplevel_eval_flex
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    9948  /buildworker/worker/package_linux64/build/src/toplevel.c:885; jl_toplevel_eval_flex
     ╎    ╎    ╎    ╎    ╎    ╎    ╎     9948  /buildworker/worker/package_linux64/build/src/interpreter.c:731; jl_interpret_toplevel_thunk
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 9947  /buildworker/worker/package_linux64/build/src/interpreter.c:516; eval_body
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  9947  /buildworker/worker/package_linux64/build/src/interpreter.c:461; eval_body
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   9947  /buildworker/worker/package_linux64/build/src/interpreter.c:215; eval_value
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    9947  /buildworker/worker/package_linux64/build/src/interpreter.c:126; do_call
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     9947  /buildworker/worker/package_linux64/build/src/julia.h:1788; jl_apply
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 9947  /buildworker/worker/package_linux64/build/src/gf.c:2429; jl_apply_generic
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  9947  /buildworker/worker/package_linux64/build/src/gf.c:2247; _jl_invoke
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   9947  /home/mateusz/bin/julia-1.7.0/lib/julia/sys.so:?; jfptr_precompile_19408.clone_1
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    9947  @Base/loading.jl:1936; precompile(argt::Type)
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     9947  /buildworker/worker/package_linux64/build/src/gf.c:2173; jl_compile_hint
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 9947  /buildworker/worker/package_linux64/build/src/gf.c:1921; jl_compile_method_internal
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  9947  /buildworker/worker/package_linux64/build/src/gf.c:1980; jl_compile_method_internal
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   9881  /buildworker/worker/package_linux64/build/src/jitlayers.cpp:350; jl_generate_fptr
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    9879  /buildworker/worker/package_linux64/build/src/jitlayers.cpp:154; _jl_compile_codeinst
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     9879  /buildworker/worker/package_linux64/build/src/jitlayers.cpp:1125; jl_add_to_ee
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 9879  /buildworker/worker/package_linux64/build/src/jitlayers.cpp:1103; jl_add_to_ee
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  9879  /buildworker/worker/package_linux64/build/src/jitlayers.cpp:1059; jl_add_to_ee
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   9879  /buildworker/worker/package_linux64/build/src/jitlayers.cpp:779; addModule
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    9879  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm3orc16ExecutionSession6lookupENS_8ArrayRefIPNS0_8JITDylibEEENS_9StringRefENS0_11SymbolStateE
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     9879  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm3orc16ExecutionSession6lookupENS_8ArrayRefIPNS0_8JITDylibEEENS0_15SymbolStringPtrENS0_11SymbolStateE
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 9879  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm3orc16ExecutionSession6lookupERKSt6vectorISt4pairIPNS0_8JITDylibENS0_19JITDylibLookupFlagsEESaIS7_EENS0_15SymbolStringPtrENS0_11SymbolStateE
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  9879  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm3orc16ExecutionSession6lookupERKSt6vectorISt4pairIPNS0_8JITDylibENS0_19JITDylibLookupFlagsEESaIS7_EERKNS0_15SymbolLookupSetENS0_10LookupKindENS0_11SymbolStateESt8functionIFvRKNS_8DenseMapIS5_NS_8DenseSetINS0_15SymbolStringPtrENS_12DenseMapIn...
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   9879  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm3orc16ExecutionSession6lookupENS0_10LookupKindERKSt6vectorISt4pairIPNS0_8JITDylibENS0_19JITDylibLookupFlagsEESaIS8_EENS0_15SymbolLookupSetENS0_11SymbolStateENS_15unique_functionIFvNS_8ExpectedINS_8DenseMapINS0_15SymbolStringPtrENS_18JITEval...
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    9879  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm3orc16ExecutionSession19OL_applyQueryPhase1ESt10unique_ptrINS0_21InProgressLookupStateESt14default_deleteIS3_EENS_5ErrorE
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     9879  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm3orc25InProgressFullLookupState8completeESt10unique_ptrINS0_21InProgressLookupStateESt14default_deleteIS3_EE
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 9879  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm3orc16ExecutionSession17OL_completeLookupESt10unique_ptrINS0_21InProgressLookupStateESt14default_deleteIS3_EESt10shared_ptrINS0_23AsynchronousSymbolQueryEESt8functionIFvRKNS_8DenseMapIPNS0_8JITDylibENS_8DenseSetINS0_15SymbolStringPtrENS_1...
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  9879  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm3orc16ExecutionSession22dispatchOutstandingMUsEv
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   9879  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZNSt17_Function_handlerIFvSt10unique_ptrIN4llvm3orc19MaterializationUnitESt14default_deleteIS3_EES0_INS2_29MaterializationResponsibilityES4_IS7_EEEPSA_E9_M_invokeERKSt9_Any_dataOS6_OS9_
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    9879  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm3orc16ExecutionSession26materializeOnCurrentThreadESt10unique_ptrINS0_19MaterializationUnitESt14default_deleteIS3_EES2_INS0_29MaterializationResponsibilityES4_IS7_EE
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     9879  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm3orc31BasicIRLayerMaterializationUnit11materializeESt10unique_ptrINS0_29MaterializationResponsibilityESt14default_deleteIS3_EE
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 9879  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm3orc14IRCompileLayer4emitESt10unique_ptrINS0_29MaterializationResponsibilityESt14default_deleteIS3_EENS0_16ThreadSafeModuleE
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  9878  /buildworker/worker/package_linux64/build/src/jitlayers.cpp:612; operator()
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   9878  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm6legacy15PassManagerImpl3runERNS_6ModuleE
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    9544  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN12_GLOBAL__N_113CGPassManager11runOnModuleERN4llvm6ModuleE
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     9544  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 9517  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN12_GLOBAL__N_113JumpThreading13runOnFunctionERN4llvm8FunctionE.part.687
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  9517  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm17JumpThreadingPass7runImplERNS_8FunctionEPNS_17TargetLibraryInfoEPNS_13LazyValueInfoEPNS_9AAResultsEPNS_14DomTreeUpdaterEbSt10unique_ptrINS_18BlockFrequencyInfoESt14default_deleteISC_EESB_INS_21BranchProbabilityInfoESD_ISG_EE
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   9461  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm17JumpThreadingPass12processBlockEPNS_10BasicBlockE
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    9358  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm17JumpThreadingPass22processThreadableEdgesEPNS_5ValueEPNS_10BasicBlockENS_13jumpthreading18ConstantPreferenceEPNS_11InstructionE.part.683
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     9354  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm17JumpThreadingPass13tryThreadEdgeEPNS_10BasicBlockERKNS_15SmallVectorImplIS2_EES2_
     ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 9352  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm17JumpThreadingPass10threadEdgeEPNS_10BasicBlockERKNS_15SmallVectorImplIS2_EES2_
    4╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  9086  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm17JumpThreadingPass9updateSSAEPNS_10BasicBlockES2_RNS_8DenseMapIPNS_11InstructionEPNS_5ValueENS_12DenseMapInfoIS5_EENS_6detail12DenseMapPairIS5_S7_EEEE
    1╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   9053  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm10SSAUpdater10RewriteUseERNS_3UseE
    1╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    1076  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm10SSAUpdater23GetValueInMiddleOfBlockEPNS_10BasicBlockE
   39╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     1075  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm10SSAUpdater28GetValueAtEndOfBlockInternalEPNS_10BasicBlockE
  276╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    7976  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm10SSAUpdater28GetValueAtEndOfBlockInternalEPNS_10BasicBlockE
  970╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     1349  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm14SSAUpdaterImplINS_10SSAUpdaterEE14BuildBlockListEPNS_10BasicBlockEPNS_15SmallVectorImplIPNS2_6BBInfoEEE
 6117╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     6340  /home/mateusz/bin/julia-1.7.0/bin/../lib/julia/libLLVM-12jl.so:?; _ZN4llvm14SSAUpdaterImplINS_10SSAUpdaterEE17FindAvailableValsEPNS_15SmallVectorImplIPNS2_6BBInfoEEE

I have no idea why that particular step (nearly all samples land in the JumpThreading pass, mostly inside SSAUpdater) is so expensive for LLVM; this would need some input from an LLVM expert. I don't know if there is any interest in fixing this in Base, but maybe a reasonable workaround could be implemented in StaticArrays.jl.

mateuszbaran, Jan 10 '22 21:01

Thank you for these insights. This helps my understanding: it at least confirms that my intuition was reasonable, and that an explanation is definitely beyond me! I'm not sure when it would be best to close or move this issue. A few open questions: who are examples of people in the LLVM community that may be interested in / capable of taking a closer look? And if StaticArrays.jl had some work-around, would it basically look like unrolling the map operation? I tried to locate the code for map but had trouble finding it, as I am not familiar enough with the Julia codebase (and maybe it's implemented at some lower level).

alhirzel, Jan 11 '22 01:01

who are examples of people in the LLVM community that may be interested in / capable of taking a closer look?

I guess you could ask in the internals channel of Julia Slack, many people there know some LLVM.

And if StaticArrays.jl had some work-around, would it basically look like unrolling the map operation?

map in StaticArrays.jl is already unrolled, but this call: map(==, [SVector(1:100...)], [SVector(1:100...)]); doesn't use map from StaticArrays.jl. It uses the generic implementation from Base, since you map over normal Vectors. We could likely catch this particular case in StaticArrays.jl and emit code that is easier for LLVM.
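As a user-side experiment, and purely as an untested assumption rather than a known fix, one could put the comparison behind a function barrier so that the unrolled SVector comparison is not inlined into the generic loop. The helper name `eq_barrier` is hypothetical and is not part of StaticArrays.jl or Base; whether it actually shortens compilation for a 100-element SVector would have to be measured with @time.

```julia
using StaticArrays

# Hypothetical helper: @noinline asks the compiler to keep the unrolled
# SVector comparison out of the body of the calling loop.
@noinline eq_barrier(x, y) = x == y

# Small size chosen only to keep this sketch quick to run.
a = SVector(1, 2, 3, 4)
b = SVector(1, 2, 3, 4)

result = map(eq_barrier, [a], [b])
println(result)
```

The result is the same as with plain ==; only the shape of the code handed to LLVM may differ.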

There are a few tools for inspecting what methods are called. Personally I like to use Cthulhu for quick exploration, for example:

julia> using StaticArrays, Cthulhu

julia> map(==, [SVector(1:100...)], [SVector(1:100...)]);

julia> @descend map(==, [SVector(1:100...)], [SVector(1:100...)])
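For a non-interactive check of which map method is selected, the standard-library macro @which can also be used. The sketch below uses nested plain Vectors as a stand-in so that it needs no packages; the assumption is that dispatch here depends on the outer container being a Vector, as described above. With StaticArrays installed, the same macro can be applied to the original call.

```julia
using InteractiveUtils   # provides @which; loaded automatically in the REPL

# A Vector of plain Vectors stands in for a Vector of SVectors.
v = [[1, 2, 3]]

m = @which map(==, v, v)

println(m)          # method signature and the file/line where it is defined
println(m.module)   # module that owns the selected method
```

The printed file and line point at the definition of map, which is a practical way to locate its source.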

mateuszbaran, Jan 11 '22 07:01