sage-on-gentoo
sage/matrix/matrix_integer_dense.pyx doctest sometimes breaks with time out
sage -t --long --random-seed=4867623489143374956615441254140194808 /usr/lib/python3.10/site-packages/sage/matrix/matrix_integer_dense.pyx # Timed out (and interrupt failed)
It doesn't always fail, but it is related to using openblas with threads. Switching openblas to use openmp makes the issue go away. It is unclear whether switching to another blas also fixes it; that needs to be tested.
A data point. I do see the failure on s-o-g but not so far on vanilla. Vanilla here uses system openblas [ pthread, -openmp ] and system singular. The s-o-g failure:
sage: a = matrix(ZZ,2,[1,-7,3,5]) ## line 5597 ##
sage: a._change_ring(RDF) ## line 5598 ##
[ 1.0 -7.0]
[ 3.0 5.0]
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 5601 ##
0
sage: A = matrix(ZZ, 3, 3, [-8, 2, 0, 0, 1, -1, 2, 1, -95]) ## line 5621 ##
sage: As = singular(A); As ## line 5622 ##
Another data point: s-o-g does not have openblas as a NEEDED lib.
On vanilla
$ objdump -p src/sage/matrix/matrix_integer_dense.cpython-310-x86_64-linux-gnu.so | grep NEEDED
NEEDED libiml.so.0
NEEDED libgmp.so.10
NEEDED libopenblas.so.0
NEEDED libpari-gmp-tls.so.7
NEEDED libflint.so.16
NEEDED libm.so.6
NEEDED libc.so.6
versus on Gentoo
$ objdump -p /usr/lib/python3.10/site-packages/sage/matrix/matrix_integer_dense.cpython-310-x86_64-linux-gnu.so | grep NEEDED
NEEDED libiml.so.0
NEEDED libpari-gmp-tls.so.7
NEEDED libflint.so.16
NEEDED libgmp.so.10
NEEDED libm.so.6
NEEDED libc.so.6
NEEDED libs may not be an issue. On my gentoo-prefix I don't see a doctest failure.
It shouldn't be an issue. blas is not used directly; it should be pulled in by iml.
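To make the comparison of the two NEEDED lists above less error-prone, here is a minimal Python sketch that parses `objdump -p` output and reports which libraries appear in one build but not the other. The helper names `parse_needed` and `needed_libs` are hypothetical, and the sample strings are simply the two listings quoted in this thread.

```python
import subprocess


def parse_needed(objdump_output):
    """Return the library names found on NEEDED lines of `objdump -p` output."""
    needed = []
    for line in objdump_output.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "NEEDED":
            needed.append(fields[1])
    return needed


def needed_libs(path):
    """Run `objdump -p` on a shared object and return its NEEDED entries.

    Hypothetical helper: requires binutils' objdump to be on PATH.
    """
    result = subprocess.run(
        ["objdump", "-p", path], capture_output=True, text=True, check=True
    )
    return parse_needed(result.stdout)


# The two listings quoted in this thread.
VANILLA_OUTPUT = """\
NEEDED libiml.so.0
NEEDED libgmp.so.10
NEEDED libopenblas.so.0
NEEDED libpari-gmp-tls.so.7
NEEDED libflint.so.16
NEEDED libm.so.6
NEEDED libc.so.6
"""

GENTOO_OUTPUT = """\
NEEDED libiml.so.0
NEEDED libpari-gmp-tls.so.7
NEEDED libflint.so.16
NEEDED libgmp.so.10
NEEDED libm.so.6
NEEDED libc.so.6
"""

only_vanilla = set(parse_needed(VANILLA_OUTPUT)) - set(parse_needed(GENTOO_OUTPUT))
only_gentoo = set(parse_needed(GENTOO_OUTPUT)) - set(parse_needed(VANILLA_OUTPUT))
print(sorted(only_vanilla))  # ['libopenblas.so.0']
print(sorted(only_gentoo))   # []
```

To check whether openblas arrives indirectly, `needed_libs` could be pointed at the installed libiml.so.0; its location is system-specific, so it is not guessed here.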
I'm able to get the time out (/storage/strogdon/gentoo-rap/usr/lib64/libopenblas.so.0(blas_thread_shutdown_+0xbf)[0x7ffb0a90889f]) on gentoo-prefix when doctesting the folder
sage -tp 9 --long ~/usr/lib/python3.10/site-packages/sage/matrix/
I have not been able to get vanilla to fail when doctesting the above folder.
From src/bin/sage-env there is
# Multithreading in OpenBLAS does not seem to play well with Sage's attempts to
# spawn new processes, see #26118. Apparently, OpenBLAS sets the thread
# affinity and, e.g., parallel doctest jobs, remain on the same core.
# Disabling that thread-affinity with OPENBLAS_MAIN_FREE=1 leads to hangs in
# some computations.
# So we disable OpenBLAS' threading completely; we might loose some performance
# here but strangely the opposite seems to be the case. Note that callers such
# as LinBox use a single-threaded OpenBLAS anyway.
export OPENBLAS_NUM_THREADS=1
Does this mean that OPENBLAS_NUM_THREADS=1 is set during doctests? In any event I get non-failing results with
OPENBLAS_NUM_THREADS=1 sage -t --long /usr/lib/python3.10/site-packages/sage/matrix/matrix_integer_dense.pyx
I'm not sure what s-o-g does relative to OPENBLAS_NUM_THREADS.
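Since the time out is intermittent, a single passing run with OPENBLAS_NUM_THREADS=1 is weak evidence. A rough Python sketch for repeating the doctest with and without the variable follows. The helper names `doctest_env`, `run_doctest` and `compare_runs` are hypothetical, the sketch assumes `sage` is on PATH, and it assumes a non-zero exit status signals a failed or timed-out run.

```python
import os
import subprocess

# Doctest target taken from the commands quoted in this thread.
DOCTEST_TARGET = (
    "/usr/lib/python3.10/site-packages/sage/matrix/matrix_integer_dense.pyx"
)


def doctest_env(num_threads=None, base_env=None):
    """Return a copy of the environment, optionally pinning OPENBLAS_NUM_THREADS."""
    env = dict(os.environ if base_env is None else base_env)
    if num_threads is not None:
        env["OPENBLAS_NUM_THREADS"] = str(num_threads)
    return env


def run_doctest(num_threads=None, target=DOCTEST_TARGET):
    """Run `sage -t --long` once and return its exit status."""
    cmd = ["sage", "-t", "--long", target]
    return subprocess.run(cmd, env=doctest_env(num_threads)).returncode


def compare_runs(repeats=5):
    """Collect exit statuses with the default environment and with one thread."""
    return {
        "default": [run_doctest(None) for _ in range(repeats)],
        "single_thread": [run_doctest(1) for _ in range(repeats)],
    }
```

Calling `compare_runs()` on an affected machine gives two lists of exit statuses to compare; nothing is executed until that call is made.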
I do nothing about it. If we were to add something, it may have to live in sage-runtest. But yes, it means the whole of vanilla sage basically runs without threads unless something overrides it. It is a bit misguided to consider only linbox: scipy uses lapack for some things, and so does iml, which is where the issue comes from.
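As an illustration only, this is roughly what such an addition could look like if it lived in a Python entry point such as sage-runtest. The function name `pin_openblas_threads` is hypothetical, and nothing like it is claimed to exist in sage or s-o-g. The sketch assumes the variable has to be set before any OpenBLAS-linked library is loaded, so it would need to run before the sage imports.

```python
import os


def pin_openblas_threads(environ=None, threads="1"):
    """Set OPENBLAS_NUM_THREADS unless a value has already been chosen.

    Hypothetical helper; returns the value that ends up in the environment.
    """
    if environ is None:
        environ = os.environ
    # setdefault keeps an explicit user setting and only fills in a default.
    return environ.setdefault("OPENBLAS_NUM_THREADS", threads)
```

Unlike the unconditional `export` in sage-env, `setdefault` leaves an explicit user setting alone; whether that is desirable for doctests is a judgment call.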
Setting OPENBLAS_NUM_THREADS definitely has an impact here. I will think about what to do about it.