VegaDatasets.jl

Lower optimization takes two (five) seconds off loading

Open PallHaraldsson opened this issue 5 years ago • 7 comments

The same optimization-level trick that was applied to Plots.jl could help here (and, I guess, in VegaLite and more). [Since this package is listed under "extras", does that mean it's loaded automatically by Queryverse?]

Also, the "second using problem" (the second using below still takes almost a second) hints at invalidations, which I believe SnoopCompile could help fix, helping further:

$ time ~/julia-1.6-DEV-latest-7c980c6af5/bin/julia -O0 --startup-file=no
julia> @time using VegaDatasets
  5.633995 seconds (12.02 M allocations: 627.345 MiB, 5.30% gc time)

julia> @time using VegaDatasets
  0.874366 seconds (2.20 M allocations: 109.821 MiB, 5.77% gc time)

julia> @time using VegaDatasets
  0.000496 seconds (477 allocations: 28.578 KiB)
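
For readers unfamiliar with the trick mentioned at the top of this comment: it presumably refers to lowering the optimization level for a single module rather than for the whole session with -O0. A minimal sketch, assuming Julia 1.5 or later, where Base.Experimental.@optlevel is available; the module and function names are hypothetical and not taken from VegaDatasets.jl:

```julia
module OptLevelDemo  # hypothetical module name, for illustration only

# Ask the compiler to use optimization level 1 for code defined in this module.
# The guard keeps the file loadable on Julia versions without the macro.
if isdefined(Base, :Experimental) && isdefined(Base.Experimental, Symbol("@optlevel"))
    @eval Base.Experimental.@optlevel 1
end

# Ordinary code follows; only its compilation effort changes, not its result.
sum_of_squares(xs) = sum(x -> x^2, xs)

end # module

println(OptLevelDemo.sum_of_squares(1:10))
```

The setting only affects code compiled for that module, so (as the later comments note) it may not help when most of the load time comes from dependencies.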

PallHaraldsson avatar Jun 03 '20 22:06 PallHaraldsson

Yes, I'm sure that could help a lot! PRs welcome.

davidanthoff avatar Jun 04 '20 00:06 davidanthoff

It's less trivial than I thought: you need the optimization I did (and/or some other, e.g. with SnoopCompile.jl) in one or more of your dependencies (where it may not be wanted). Applying it only in this package wasn't effective, so I closed the PR.

What you would really want is something like the following (not yet possible from within the code):

$ ~/julia-1.6-DEV-latest-7c980c6af5/bin/julia --startup-file=no -O0 --compile=min -q

julia> @time using VegaDatasets
  2.441897 seconds (3.31 M allocations: 202.032 MiB, 1.18% gc time)

vs:

julia> @time using VegaDatasets
  7.573305 seconds (12.37 M allocations: 645.933 MiB, 4.22% gc time)

julia> @time using VegaDatasets
  1.501893 seconds (2.20 M allocations: 109.827 MiB, 1.87% gc time)

julia> @time using VegaDatasets
  0.000447 seconds (582 allocations: 34.594 KiB)

PallHaraldsson avatar Jun 08 '20 16:06 PallHaraldsson

The biggest offender is the following (would you ever, or usually, use this capability for Vega, or in TextParse?):

https://github.com/JuliaMath/DoubleFloats.jl/issues/109#issue-634823325

I thought about filing an issue against https://github.com/queryverse/TextParse.jl, your slowest dependency, but its dependency (the above) was to blame. It's a deep rabbit hole. Since it's your package, or under your umbrella, you may want to look into it (should I, or someone else, file an issue there, since 90% of its slow startup is from its dependency?). Maybe Requires.jl could be used; I've never used it, but I think it's meant for such cases.

PallHaraldsson avatar Jun 08 '20 18:06 PallHaraldsson

Ah, super interesting! DoubleFloats.jl is quite key for the float parsing in TextParse.jl. And float parsing in general seems so central that I don't really see how we could remove that... So that probably leaves us with fixing the invalidations in DoubleFloats.jl, right? That sounds non-trivial...

davidanthoff avatar Jun 10 '20 22:06 davidanthoff

OK, I think I found out how, and I showed it in the (now) latest comment: https://github.com/JuliaMath/DoubleFloats.jl/issues/109#issuecomment-640957870. It just needs to be added and tested.

PallHaraldsson avatar Jun 11 '20 00:06 PallHaraldsson

FYI:

Timing is getting worse (for me); the former is with -O0:

julia> @time using VegaDatasets
  7.311773 seconds (16.16 M allocations: 840.713 MiB, 5.37% gc time)


julia> @time using VegaDatasets
 10.124182 seconds (16.16 M allocations: 840.713 MiB, 3.85% gc time)

I recently added some packages, which downgraded some others. That might be the reason. I'll look a bit more into it; I'm just showing you what users can get.

PallHaraldsson avatar Jul 09 '20 11:07 PallHaraldsson

FYI, on Julia 1.6 master (the former should be doable with an updated PR using new features in 1.6):

$ julia -O0 --compile=min -q
julia> @time using VegaDatasets
  1.273415 seconds (1.42 M allocations: 96.934 MiB)

with defaults:
  5.133299 seconds (5.92 M allocations: 355.355 MiB, 1.60% gc time)

It's a bit ironic compared to the code, i.e. VegaLite (first with the same options, the last with defaults):

julia> @time using VegaLite
  1.005628 seconds (1.04 M allocations: 72.780 MiB)

$ julia -O1 -q  # I would add this option to VegaLite:
julia> @time using VegaLite
  2.420783 seconds (4.38 M allocations: 264.755 MiB, 3.41% gc time)

julia> @time using VegaLite
  3.274863 seconds (4.38 M allocations: 264.758 MiB, 2.81% gc time)

So, Queryverse goes from 3.4 sec. to 12.4 sec. on defaults.

PallHaraldsson avatar Aug 27 '20 12:08 PallHaraldsson