AbstractAlgebra.jl
Matrix-Strassen doctest needs excessive time (needs fix and re-enabling)
See e.g. https://github.com/Nemocas/AbstractAlgebra.jl/actions/runs/14963175106/job/42028470785
The doctest step in the 1.10, ubuntu-latest job needed 15min, while the same step in the 1.10, macOS-latest job needed 64min.
Similar timings can be seen for other invocations of CI as well.
Any idea what's causing this?
cc @fingolfin @benlorenz
Not sure what happens in CI, but when testing locally on munk (with the documenter_helpers.jl from Oscar) most of the time of the doctest run was spent in src/Matrix-Strassen.jl:15-28:
page: src/Matrix-Strassen.jl:15-28
429.511165 seconds (12.88 G allocations: 233.800 GiB, 36.81% gc time, 0.49% compilation time)
(this is out of 8m11.5s total)
I pasted the code block into a Julia 1.10 REPL on my Linux machine as well and it is still running after 13 minutes.
I wanted to re-run the doctests but munk kicked me out of my sessions and doesn't respond anymore:
page: src/Matrix-Strassen.jl:15-28
Connection to munk.mathematik.uni-kl.de closed by remote host.
Connection to munk.mathematik.uni-kl.de closed.
Edit: It seems to have rebooted, let's see what happens when running it again. The re-run also took about 8 minutes, so the duration seems pretty stable.
The code block in question:
julia> m = matrix(ZZ, rand(-10:10, 1000, 1000));
julia> n1 = similar(m); n2 = similar(m); n3 = similar(m);
julia> n1 = mul!(n1, m, m);
julia> n2 = Strassen.mul!(n2, m, m);
julia> n3 = Strassen.mul!(n3, m, m; cutoff = 100);
julia> n1 == n2 == n3
true
This does rely on random numbers and I don't know if this is seeded in any way.
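Whether the doctest run seeds the RNG is not established here. If reproducible input is wanted, one option is to seed explicitly before generating the entries. A minimal sketch using only Julia's `Random` standard library; the seed `1234` and the 4x4 size are arbitrary values chosen for illustration:

```julia
using Random

# Seed the default RNG, then draw a small matrix of entries in -10:10.
Random.seed!(1234)
a = rand(-10:10, 4, 4)

# Re-seeding with the same value reproduces the same draw.
Random.seed!(1234)
b = rand(-10:10, 4, 4)

println(a == b)
```

Seeding only makes the input reproducible; it does not change how long the multiplication takes.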
Edit: I have started a run on CI without the Strassen block and with the custom output: https://github.com/Nemocas/AbstractAlgebra.jl/actions/runs/15050984971, using this commit: https://github.com/Nemocas/AbstractAlgebra.jl/commit/81911b6953ba64a099ed5ef68e97dcfc833120d8
Without the Matrix-Strassen test the doctests took just 1min17s on macOS: https://github.com/Nemocas/AbstractAlgebra.jl/actions/runs/15050984971/job/42305480071#step:7:698
Can we then remove this doctest in the meantime, let @fieker figure out the problem, and re-enable the doctest once there is a fix?
Some further data: Ubuntu CI (out of 15 min total):
page: src/Matrix-Strassen.jl:15-28
805.782778 seconds (12.88 G allocations: 233.799 GiB, 27.81% gc time, 0.49% compilation time)
Windows CI (out of 21 min total):
page: src\Matrix-Strassen.jl:15-28
1131.379076 seconds (12.88 G allocations: 233.796 GiB, 21.23% gc time, 0.38% compilation time)
macOS CI (out of 47 min total):
page: src/Matrix-Strassen.jl:15-28
2736.860504 seconds (12.88 G allocations: 233.799 GiB, 78.58% gc time, 0.54% compilation time)
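macOS does have a lot more GC time, so it might be caused by the runner having less memory available: 7 GB vs. 16 GB on Linux.
Looking only at the time spent outside GC: on macOS that is about 21% of 2737 seconds, which gives 575 seconds, and on Ubuntu it is about 72% of 805 seconds, which gives 580 seconds. The two match quite well, which suggests the extra macOS time is almost entirely GC.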
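The time spent outside GC can be recomputed from the totals and GC percentages reported above. A small sketch; the numbers are copied from the timing lines in this thread, and the variable names are only for illustration:

```julia
# (runner, total seconds, GC fraction) taken from the timing output above
runs = [
    ("Ubuntu",  805.782778,  0.2781),
    ("Windows", 1131.379076, 0.2123),
    ("macOS",   2736.860504, 0.7858),
]

# Time outside GC = total * (1 - GC fraction)
non_gc = Dict(name => total * (1 - gc) for (name, total, gc) in runs)

for (name, _, _) in runs
    println(rpad(name, 8), round(non_gc[name]; digits = 1), " s outside GC")
end
```

With the unrounded percentages, Ubuntu and macOS both come out within a few seconds of each other, while the Windows run spends noticeably more time outside GC.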
The doctest in question has been disabled in https://github.com/Nemocas/AbstractAlgebra.jl/pull/2085 to reduce the load on the CI runners. @fieker could you please check whether the main problem is the Strassen code or the naive multiplication that's used in the tests? In the former case, I think you would be interested in trying to find the underlying issue. In the latter case, could you think of something smaller to test that still hits all code branches?
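To see whether the naive multiplication or the Strassen code dominates, the two calls could be timed separately on a much smaller input. A rough sketch, assuming `mul!` and `Strassen.mul!` are reachable as `AbstractAlgebra.mul!` and `AbstractAlgebra.Strassen.mul!` (the doctest uses them unqualified); the size 200 and `cutoff = 50` are arbitrary illustration values, and this is not claimed to hit all code branches:

```julia
using AbstractAlgebra

# Much smaller than the 1000x1000 doctest input; the size is an arbitrary choice.
m = matrix(ZZ, rand(-10:10, 200, 200))
n1 = similar(m); n2 = similar(m)

# Warm-up calls so that compilation is not part of the timings below.
n1 = AbstractAlgebra.mul!(n1, m, m)
n2 = AbstractAlgebra.Strassen.mul!(n2, m, m; cutoff = 50)

# Time each multiplication on its own.
t_naive    = @elapsed AbstractAlgebra.mul!(n1, m, m)
t_strassen = @elapsed AbstractAlgebra.Strassen.mul!(n2, m, m; cutoff = 50)

println("mul!:          ", t_naive, " s")
println("Strassen.mul!: ", t_strassen, " s")
println("results agree: ", n1 == n2)
```

Timings at this size will not be comparable to the 1000x1000 case; the point is only to see which of the two calls accounts for most of the time and allocations.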