Plots.jl icon indicating copy to clipboard operation
Plots.jl copied to clipboard

Effect of precompilation

Open timholy opened this issue 5 years ago • 27 comments

https://discourse.julialang.org/t/suddenly-cannot-use-plots-package/42385

It would be sad to turn off precompilation entirely on older Julia versions, but unless you can track down the issue more specifically it's the easiest way to fix the problem.

timholy avatar Jul 02 '20 07:07 timholy

Actually, is this even making use of the precompilation from SnoopCompileBot? I don't see any place where the source code says include("precompile_includer.jl"). @aminya, is this missing from the bot docs?

timholy avatar Jul 02 '20 08:07 timholy

You're right, probably my fault.

daschw avatar Jul 02 '20 08:07 daschw

Wait, this is strange. The bot should add include("precompile_includer.jl") automatically. I wonder it has failed to do so! 🤔

The log says it has done it! But it seems it is lost somewhere in the process. I need to test this offline and will get back to you.

https://github.com/JuliaPlots/Plots.jl/runs/825527122?check_suite_focus=true#step:7:10

[ Info: SnoopCompile will try to write  "include("precompile_includer.jl")" before end of the module in /home/runner/work/Plots.jl/Plots.jl/src/Plots.jl. Assume that the last `end` is the end of a module.

aminya avatar Jul 02 '20 08:07 aminya

Also, seems unlikely to be the source of the crash reported in discourse, but do be careful about crashes in versions older than 1.4.2 (see the Julia PR linked from discourse).

Also, permit me a couple of SnoopCompile tutorials. First, let's see how well precompiation is working: after having added the include("precompile_includer.jl"), I get this:

julia> using SnoopCompileCore

julia> tinf = @snoopi tmin=0.01 (using Plots; p = plot(rand(10)); display(p))
33-element Array{Tuple{Float64,Core.MethodInstance},1}:
 (0.01029205322265625, MethodInstance for (::Core.var"#Type##kw")(::NamedTuple{(:libc, :compiler_abi),Tuple{Nothing,Pkg.BinaryPlatforms.CompilerABI}}, ::Type{Pkg.BinaryPlatforms.FreeBSD}, ::Symbol))
 (0.010421991348266602, MethodInstance for foreach(::Function, ::Array{Dict{Symbol,Any},1}, ::Array{Dict{Symbol,Any},1}))
 (0.010580062866210938, MethodInstance for warn_on_recipe_aliases!(::Plots.Plot{Plots.GRBackend}, ::Dict{Symbol,Any}, ::Symbol, ::Array{Float64,1}))
 (0.011501073837280273, MethodInstance for #any#651(::Function, ::typeof(any), ::Function, ::Array{Float64,1}))
 (0.011873960494995117, MethodInstance for get!(::Base.var"#137#138"{Set{Symbol}}, ::Dict{Symbol,Set{Symbol}}, ::Symbol))
 (0.01209402084350586, MethodInstance for resize_nthreads!(::Array{HTTP.URIs.RegexAndMatchData,1}, ::HTTP.URIs.RegexAndMatchData))
 (0.012480974197387695, MethodInstance for sum(::Array{Measures.Length{:mm,Float64},2}))
 (0.012552022933959961, MethodInstance for setindex!(::Dict{Any,Plots.Subplot}, ::Plots.Subplot{Plots.GRBackend}, ::Symbol))
 (0.012843132019042969, MethodInstance for setindex!(::Dict{Symbol,PlotThemes.PlotTheme}, ::PlotThemes.PlotTheme, ::Symbol))
 (0.014549016952514648, MethodInstance for setindex!(::Dict{Plots.Subplot,Any}, ::Dict{Symbol,Any}, ::Plots.Subplot{Plots.GRBackend}))
 (0.014793872833251953, MethodInstance for #gr_set_font#334(::Symbol, ::Symbol, ::RGB{FixedPointNumbers.Normed{UInt8,8}}, ::Float64, ::typeof(Plots.gr_set_font), ::Plots.Font, ::Plots.Subplot{Plots.GRBackend}))
 (0.015500068664550781, MethodInstance for #gr_set_font#334(::Symbol, ::Symbol, ::RGB{FixedPointNumbers.Normed{UInt8,8}}, ::Int64, ::typeof(Plots.gr_set_font), ::Plots.Font, ::Plots.Subplot{Plots.GRBackend}))
 (0.015724897384643555, MethodInstance for append!(::Array{RecipeData,1}, ::Array{RecipeData,1}))
 (0.016112089157104492, MethodInstance for _collect(::UnitRange{Int64}, ::Base.Iterators.Flatten{Tuple{Base.KeySet{Symbol,Dict{Symbol,Any}},Set{Symbol}}}, ::Base.HasEltype, ::Base.SizeUnknown))
 (0.016675949096679688, MethodInstance for _unsafe_copyto!(::Array{Plots.Subplot,1}, ::Int64, ::Array{Plots.Subplot{Plots.GRBackend},1}, ::Int64, ::Int64))
 (0.018754005432128906, MethodInstance for maximum(::Array{Measures.Length{:mm,Float64},1}))
 (0.019168853759765625, MethodInstance for _mapreduce_dim(::Function, ::Function, ::Base._InitialValue, ::Array{Measures.Length{:mm,Float64},2}, ::Int64))
 (0.019948959350585938, MethodInstance for mapreducedim!(::Function, ::Function, ::Array{Measures.Length{:mm,Float64},2}, ::Array{Measures.Length{:mm,Float64},2}))
 (0.020833969116210938, MethodInstance for get!(::Requires.var"#1#2", ::Dict{Base.PkgId,Array{Function,1}}, ::Base.PkgId))
 (0.022881031036376953, MethodInstance for *(::Int64, ::Array{Measures.Length{:mm,Float64},1}))
 (0.023634910583496094, MethodInstance for collect_to_with_first!(::Array{NTuple{4,Measures.Length{:mm,Float64}},2}, ::NTuple{4,Measures.Length{:mm,Float64}}, ::Base.Generator{Array{AbstractLayout,2},typeof(Plots._update_min_padding!)}, ::Int64))
 (0.024197101593017578, MethodInstance for Plots.GridLayout(::Int64, ::Vararg{Int64,N} where N))
 (0.025114059448242188, MethodInstance for apply_recipe(::Dict{Symbol,Any}, ::Array{Float64,1}))
 (0.03254199028015137, MethodInstance for warn_on_recipe_aliases!(::Plots.Plot{Plots.GRBackend}, ::Array{RecipeData,1}, ::Symbol, ::Array{Float64,1}))
 (0.03902387619018555, MethodInstance for resize_nthreads!(::Array{Dict{Sockets.IPAddr,HTTP.Servers.RateLimit},1}, ::Dict{Sockets.IPAddr,HTTP.Servers.RateLimit}))
 (0.04192495346069336, MethodInstance for +(::Array{Measures.Length{:mm,Float64},2}, ::Array{Measures.Length{:mm,Float64},2}))
 (0.056591033935546875, MethodInstance for _parse_colorant(::String))
 (0.0628499984741211, MethodInstance for showoff(::Array{Float64,1}, ::Symbol))
 (0.06933903694152832, MethodInstance for apply_recipe(::Dict{Symbol,Any}, ::Type{RecipesPipeline.SliceIt}, ::Nothing, ::Array{Float64,1}, ::Nothing))
 (0.07117319107055664, MethodInstance for recipe_pipeline!(::Plots.Plot{Plots.GRBackend}, ::Dict{Symbol,Any}, ::Tuple{Array{Float64,1}}))
 (0.09401392936706543, MethodInstance for (::Base.var"#maximum##kw")(::NamedTuple{(:dims,),Tuple{Int64}}, ::typeof(maximum), ::Array{Measures.Length{:mm,Float64},2}))
 (0.12085700035095215, MethodInstance for (::PlotUtils.var"#optimize_ticks##kw")(::NamedTuple{(:k_min, :k_max),Tuple{Int64,Int64}}, ::typeof(optimize_ticks), ::Float64, ::Float64))
 (0.17196011543273926, MethodInstance for resize_nthreads!(::Array{Dict{String,Set{HTTP.Cookies.Cookie}},1}, ::Dict{String,Set{HTTP.Cookies.Cookie}}))

This is a measure of stuff that isn't being precompiled because it's what's left over still needing inference. Several of those belong to dependent packages or base Julia. It's not bad at all, about 1.5s of remaining inference time. Main stragglers are in dependent packages like PlotUtils and RecipesPipeline, and it's possible that you could add precompilation to those packages and get them covered, too. (Also, HTTP is probably costing you a lot...what do you use it for?)

Also, one other tip: you might consider playing with the tmin parameter of BotConfig. I noticed once I added the include("precompile_includer.jl") that the time for plot dropped by half a second and the time for loading increased by half a second. Of course it's probably a net win because you've precompiled more than just plot, but do keep in mind that loading the cached inference results does, of course, take some time. You can use the tmin parameter to limit yourself just to things that take enough inference time that it's better to load them than it is to recompute them.

timholy avatar Jul 02 '20 08:07 timholy

Thanks for the hints!

Also, seems unlikely to be the source of the crash reported in discourse, but do be careful about crashes in versions older than 1.4.2 (see the Julia PR linked from discourse)

We run our tests on 1.3.1 on all platforms.

Main stragglers are in dependent packages like PlotUtils and RecipesPipeline, and it's possible that you could add precompilation to those packages and get them covered, too.

I was planning on doing that. Thanks for the reminder!

daschw avatar Jul 02 '20 08:07 daschw

Wait, this is strange. The bot should add include("precompile_includer.jl") automatically. I wonder it has failed to do so! 🤔

The log says it has done it! But it seems it is lost somewhere in the process. I need to test this offline and will get back to you.

I tested this offline, and I added more tests just in case. All pass. Even the artifact uploaded by the action has the "include("precompile_includer.jl")" in it. Maybe the Create_PR step has removed it for some reason!

https://github.com/JuliaPlots/Plots.jl/suites/862271274/artifacts/9944298

OK. This part of the script which tries to discard whitespace changes removes precompile_includer:

(git diff -w --no-color || git apply --cached --ignore-whitespace && git checkout -- . && git reset && git add -p) || echo done

Changing it to this solves the problem:

(git diff -w --no-color || (git diff -w --no-color | git apply --cached --ignore-whitespace && git checkout -- . && git reset && git add -p)) || echo done

aminya avatar Jul 02 '20 09:07 aminya

Http is used to fetch the latest plotly library, but yeah maybe there's a way to make it a system dependency or download it somehow faster

isentropic avatar Jul 08 '20 01:07 isentropic

Is this fixed?

BeastyBlacksmith avatar Jul 23 '20 08:07 BeastyBlacksmith

Doesn't seem so: https://github.com/JuliaPlots/Plots.jl/issues/2881

BeastyBlacksmith avatar Jul 28 '20 15:07 BeastyBlacksmith

I do not think SegFaults are related to this issue. The title is misleading, it was just a mistake for not including the precompilation files. I am waiting for @timholy to merge https://github.com/timholy/SnoopCompile.jl/pull/122 that resolves that.

aminya avatar Jul 28 '20 16:07 aminya

You can switch to the old precompilation backend (using snoopc) by changing the snoop_mode for julia <=1.3:

https://github.com/JuliaPlots/Plots.jl/blob/5673ced5d730dc7645ca01eb753a94cadd22b2a2/deps/SnoopCompile/snoop_bot.jl#L3

@static if VERSION <= v"1.3.9" 
	snoop_bot(
	    BotConfig(
	        "Plots",
	        yml_path= "SnoopCompile.yml",
	        else_os = "linux",
	        else_version = "1.4",
	    ),
	    joinpath(@__DIR__, "precompile_script.jl"),
		snoop_mode = :snoopc			# uses `snoopc` 
	)
else
	snoop_bot(
	    BotConfig(
	        "Plots",
	        yml_path= "SnoopCompile.yml",
	        else_os = "linux",
	        else_version = "1.4",
	    ),
	    joinpath(@__DIR__, "precompile_script.jl"),
	)
end 


aminya avatar Jul 28 '20 16:07 aminya

As @timholy suggested I played around a little with the tmin parameter of @snoopi. I used tmin = [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5] and measured the time to first plot 15 times for each of the julia versions 1.3, 1.4, 1.5 and 1.6. Here are the results: ttfp_vs_version

I am unsure what to conclude from this. It seems like the parameter does not have that much affect compared to the improvements achieved by @timholy and others from one Julia version to another. It looks like 5e-3, 1e-2 or 5e-2 would be the better choices. So I guess we should stay with the default, which I think is 1e-2.

If anyone is interested here are also the lines of code of the resulting precompile.jl files for different tmin loc

and the times of individual commands to the first plot display. individual_times

daschw avatar Aug 28 '20 21:08 daschw

Very interesting!

Just to throw out yet another possibility: writing them by hand (which is what I usually do). I know that sounds like a lot of work, but I find it's not too bad, and it gives you direct feedback about what "works" (meaning, which precompile statements actually take effect and which ones don't). Moreover, hand-written files typically aren't sensitive to quite as many particulars that are liable to change if the package changes, although obviously the bot gets rid of most of that problem.

If you're curious, here are two examples of hand-written precompile files:

  • https://github.com/JuliaImages/ImageCore.jl/blob/master/src/precompile.jl (I take advantage of loops to compactly and deterministically precompile for different eltypes, dimensionalities, etc).
  • and here's one that takes advantage of the new bodyfunction on master: https://github.com/JuliaIO/FileIO.jl/blob/master/src/precompile.jl

The sort of process I use is illustrated here: https://github.com/JuliaImages/ImageQualityIndexes.jl/issues/24#issuecomment-682408957. As you can see, sometimes it's not straightforward to get precompilation to actually take effect (as witnessed by the fact that I haven't yet succeeded in the case in that link); because I'm interested in making precompilation work better, it's useful for me to see feedback on a statement-by-statement basis. I tend to bat around 50% or so, I think. In some cases I've even added a precompile statement in Base to make a package's precompilation work better (e.g., https://github.com/JuliaPackaging/Requires.jl/pull/89). You can really only do that kind of stuff if you're looking carefully at what works and what doesn't.

If you're interested in doing something similar I'm happy to collaborate on it.

timholy avatar Aug 28 '20 23:08 timholy

It looks like 5e-3, 1e-2 or 5e-2 would be the better choices. So I guess we should stay with the default, which I think is 1e-2.

Yes, in SnoopCompileBot 2 the default tmin is 0.01 which I found a good default for most cases compared to 0 in the older versions.

ttfp_vs_version

The Y-axis of this plot does not have much resolution, and so the differences are not visible much. The unit is 2 seconds which makes the comparison hard.

It would be better to focus on the latest version and use a higher resolution on the Y-axis so we can see the differences better.

The most important downside of using a very low tmin is the initial loading time of Plots and also the time it is spent for precompilation in the very first load of Plots. So doing some benchmark for this would reveal more things.

Another thing to look into is to benchmark the actual cost of loading the inference caches (which probably depends on the operating system's IO speed). If for example, loading an inference cache takes 1ms that means using any tmin lower than 0.001 does not make sense. That's probably why tmin 0.0001 has a higher time-to-first-plot. Because loading each of the caches takes more time than the actual time benefit of doing so.

aminya avatar Aug 28 '20 23:08 aminya

Incidentally, the precompilation times of StatsPlots seem to have skyrocketed with newer Julia versions. Not sure what's going on there.

mkborregaard avatar Aug 29 '20 08:08 mkborregaard

Can you define "newer versions"? Meaning 1.5? master in the last two weeks? Or something else?

timholy avatar Aug 29 '20 09:08 timholy

Sorry. I meant 1.4 and more pronounced in 1.5. Let me just do some timings.

mkborregaard avatar Aug 29 '20 09:08 mkborregaard

@aminya, here are the results for 1.6 with higher y resolution. ttfp_v1 6

@timholy Thanks for your hint to the possibility and advantages of manually written precompile files. I just read your blog post on invalidations yesterday and I thinks I still have improve my unterstanding of these things and re-read SnoopCompile docs, before I could get this started. Thanks a lot for offering to collaborate! I will have a look at the examples you provided and let you know if I want to try this.

daschw avatar Aug 29 '20 11:08 daschw

OK sorry for just making a remark out of hand. I just checked, and the pattern is the other way around -

;add StatsPlots
@time using StatsPlots

takes 180 seconds on 1.3, 120 seconds on 1.4 and 90 seconds on 1.5, so it's actually improved remarkably.

mkborregaard avatar Aug 29 '20 13:08 mkborregaard

@daschw could you share the code/config for those plots? They are really attractive and its a useful design for many comparison-style summaries

mahiki avatar Aug 31 '20 01:08 mahiki

Sure,

using CSV
using DataFrames
using StatsPlots
# using Statistics

data = sort(DataFrame(CSV.File(joinpath(@__DIR__, "benchmarks.csv"))))

# set backend with defaults
pgfplotsx(
    bg_inside = colorant"#29353b",
    bg_outside = colorant"#222d32",
    bg_legend = colorant"#222d32",
    fg = colorant"#cfd8dc",
    fg_grid = colorant"#cfd8dc",
    fg_legend = plot_color(colorant"#cfd8dc", 0),
    fg_axis = colorant"#cfd8dc",
    minorgridcolor = colorant"#cfd8dc",
    palette = :Spectral_8,
    dpi = 200,
    markerstrokecolor = :auto,
    yminorgrid = true,
    framestyle = :grid,
)

@df data groupedboxplot(
    :JuliaVersion,
    :UsingTime .+ :PlotTime .+ :DisplayTime,
    group = :tmin,
    legendtitle = " tmin [s]",
    xguide = "Julia Version",
    yguide = "TTFP [s]",
    markersize = 3,
    linewidth = 0.5,
    title = "Total TTFP for different choices of tmin in @snoopi",
    palette = palette(:Spectral_8),
)

You can find the data here.

daschw avatar Aug 31 '20 07:08 daschw

This represents ...soo... much time spent learning, when i have none. Thank you!!

mahiki avatar Sep 02 '20 03:09 mahiki

@mkborregaard, my machine is slower than yours (or we started from different foundations, I started from an empty package repository), but for comparison:

Julia 1.5: 160s Julia 1.6 (recent src build): 105s

so the trend of improvement continues.

timholy avatar Sep 02 '20 10:09 timholy

Great :-) I wonder why StatsPlots is so crazy though.

mkborregaard avatar Sep 02 '20 10:09 mkborregaard

This represents ...soo... much time spent learning, when i have none. Thank you!!

To be fair, @daschw is quite familiar with the Plots package... :trollface:

mkborregaard avatar Sep 02 '20 10:09 mkborregaard

It's not so crazy, look at how many packages it precompiled to just to precompile StatsPlots:

tim@diva:/tmp/pkgs/compiled/v1.6$ ls
AbstractFFTs    ColorSchemes                  Distributions        FreeType2_jll                JSON            MultivariateStats  OpenSSL_jll         QuadGK           Showoff            TableOperations
Adapt           ColorTypes                    DocStringExtensions  FriBidi_jll                  KernelDensity   NaNMath            Opus_jll            Ratios           SortingAlgorithms  Tables
Arpack          CompilerSupportLibraries_jll  FFMPEG               GeometryBasics               LAME_jll        NearestNeighbors   OrderedCollections  RecipesBase      SpecialFunctions   TableTraits
Arpack_jll      DataAPI                       FFMPEG_jll           GeometryTypes                libass_jll      Observables        Parsers             RecipesPipeline  StaticArrays       Widgets
AxisAlgorithms  DataStructures                FFTW                 GR                           libfdk_aac_jll  OffsetArrays       PDMats              Reexport         StatsBase          WoodburyMatrices
Bzip2_jll       DataValueInterfaces           FFTW_jll             Interpolations               libvorbis_jll   Ogg_jll            Plots               Requires         StatsFuns          x264_jll
Clustering      DataValues                    FillArrays           IteratorInterfaceExtensions  Measures        OpenBLAS_jll       PlotThemes          Rmath            StatsPlots         x265_jll
Colors          Distances                     FixedPointNumbers    IterTools                    Missings        OpenSpecFun_jll    PlotUtils           Rmath_jll        StructArrays       Zlib_jll

Remember, I started from a blank slate and only did add StatsPlots; using StatsPlots so each one of these got compiled solely to make StatsPlots.

Your typical C/C++ project would take quite a lot of time to compile that much code.

timholy avatar Sep 02 '20 10:09 timholy

xref https://github.com/JuliaLang/julia/issues/37509

timholy avatar Sep 10 '20 12:09 timholy

Reworked by https://github.com/JuliaPlots/Plots.jl/pull/4334.

t-bltg avatar Nov 29 '22 14:11 t-bltg