julia
julia copied to clipboard
[rfc] parallel marking
This PR extends the refactoring from https://github.com/JuliaLang/julia/pull/45608 by parallelizing the GC mark-loop.
Some benchmark results are here: https://hackmd.io/@Idnmfpb3SxK98-OsBtRD5A/HJn86Bl25
TODO
- [x] Move work-stealing queue implementation to separate file.
- [x] Run more benchmarks (and test scalability for large number of threads).
- [x] Fix GC debugging infrastructure & annotations.
CC: @vchuravy, @chflood, @kpamnany
can we get 32 core benchmarks as well? It would be very unfortunate if this was a regression for cpus with many cores. (it would be fine to limit GC to use 4 if thats all it can use, but negative scaling is very unfortunate)
(EDIT: see scaling plots below)
Were these timings made on a version of Julia that includes the changes https://github.com/JuliaLang/julia/pull/45714 that make the GC time report correctly?
No. Will merge and update that.
The squash went pretty bad?
yeah it did, hopefully fixed in the latest commit
Latest commits implement @vchuravy and @chflood's idea of batching big arrays (bugs are mine). Seems to be working fine on my machine with the many_refs
benchmark recently added to GCBenchmarks
. Might need more stress-testing though.
@nanosoldier runtests(ALL, vs = ":master", buildflags=["LLVM_ASSERTIONS=1", "FORCE_ASSERTIONS=1"], vs_buildflags=["LLVM_ASSERTIONS=1", "FORCE_ASSERTIONS=1"])
@nanosoldier runbenchmarks(!"scalar", vs=":master")
Something went wrong when running your job:
NanosoldierError: failed to run benchmarks against primary commit: failed process: Process(`sudo -n /nanosoldier/cset/bin/cset shield -e -- sudo -n -u nanosoldier-worker -- /nanosoldier/workdir/jl_Ru6OqQ/benchscript.sh`, ProcessSignaled(6)) [0]
Logs and partial data can be found here
@nanosoldier runbenchmarks(!"scalar", vs=":master")
@nanosoldier runbenchmarks(!"scalar", vs=":master")
Looks like there are a couple more key packages that need fixing from glancing at some parts of that report, including FFTW, HTTP, and Parsers. The HTTP and Parsers code looks like it was probably already very dangerous (looks like it probably may be introducing data races into the code). The FFTW code is just trivially deciding if thread support is enabled, and that call can be removed: https://github.com/JuliaMath/FFTW.jl/blob/17bc81a0fcf9875d777ea4bee2fca70fc23c8a0c/src/providers.jl#L75
Were there any regressions? Can you post the before/after?
@nanosoldier runtests(ALL, vs = ":master", buildflags=["LLVM_ASSERTIONS=1", "FORCE_ASSERTIONS=1"], vs_buildflags=["LLVM_ASSERTIONS=1", "FORCE_ASSERTIONS=1"])
What is the status of this now that https://github.com/JuliaLang/julia/pull/21590 was merged?
I think it's not ready to merge/review yet. It's still causing regressions on some of the GCBenchmarks and I'm diagnosing that (will post some scaling plots in a sec).
I think @kpamnany also saw some segfaults on RAI tests.
Speedup/slowdowns on GC times relative to the master commit from which it branched vs nthreads (1, 2, 4, 8, 16):
Superseded by https://github.com/JuliaLang/julia/pull/48600.