Two tests fail on macOS PPC: dbcsr_unittest2, dbcsr_unittest3
I am bringing dbcsr to Macports, where we support a wide range of macOS versions, including old ones (at least 10.5+).
Two tests fail on 10.6.8 under Rosetta (I cannot test native PPC at the moment, being away from PPC hardware): dbcsr_unittest2 and dbcsr_unittest3.
---> Testing dbcsr
Executing: cd "/opt/local/var/macports/build/_opt_PPCRosettaPorts_math_dbcsr/dbcsr/work/build" && ctest test
Test project /opt/local/var/macports/build/_opt_PPCRosettaPorts_math_dbcsr/dbcsr/work/build
Start 1: dbcsr_perf:inputs/test_H2O.perf
1/19 Test #1: dbcsr_perf:inputs/test_H2O.perf ....................... Passed 581.03 sec
Start 2: dbcsr_perf:inputs/test_rect1_dense.perf
2/19 Test #2: dbcsr_perf:inputs/test_rect1_dense.perf ............... Passed 1.06 sec
Start 3: dbcsr_perf:inputs/test_rect1_sparse.perf
3/19 Test #3: dbcsr_perf:inputs/test_rect1_sparse.perf .............. Passed 10.55 sec
Start 4: dbcsr_perf:inputs/test_rect2_dense.perf
4/19 Test #4: dbcsr_perf:inputs/test_rect2_dense.perf ............... Passed 1.01 sec
Start 5: dbcsr_perf:inputs/test_rect2_sparse.perf
5/19 Test #5: dbcsr_perf:inputs/test_rect2_sparse.perf .............. Passed 8.58 sec
Start 6: dbcsr_perf:inputs/test_singleblock.perf
6/19 Test #6: dbcsr_perf:inputs/test_singleblock.perf ............... Passed 0.74 sec
Start 7: dbcsr_perf:inputs/test_square_dense.perf
7/19 Test #7: dbcsr_perf:inputs/test_square_dense.perf .............. Passed 0.73 sec
Start 8: dbcsr_perf:inputs/test_square_sparse.perf
8/19 Test #8: dbcsr_perf:inputs/test_square_sparse.perf ............. Passed 3.15 sec
Start 9: dbcsr_perf:inputs/test_square_sparse_bigblocks.perf
9/19 Test #9: dbcsr_perf:inputs/test_square_sparse_bigblocks.perf ... Passed 4.10 sec
Start 10: dbcsr_perf:inputs/test_square_sparse_rma.perf
10/19 Test #10: dbcsr_perf:inputs/test_square_sparse_rma.perf ......... Passed 2.90 sec
Start 11: dbcsr_unittest1
11/19 Test #11: dbcsr_unittest1 ....................................... Passed 158.55 sec
Start 12: dbcsr_unittest2
12/19 Test #12: dbcsr_unittest2 .......................................***Failed 0.78 sec
Start 13: dbcsr_unittest3
13/19 Test #13: dbcsr_unittest3 .......................................***Failed 13.93 sec
Start 14: dbcsr_unittest4
14/19 Test #14: dbcsr_unittest4 ....................................... Passed 0.62 sec
Start 15: dbcsr_tensor_unittest
15/19 Test #15: dbcsr_tensor_unittest ................................. Passed 10.30 sec
Start 16: dbcsr_tas_unittest
16/19 Test #16: dbcsr_tas_unittest .................................... Passed 4.37 sec
Start 17: dbcsr_test_csr_conversions
17/19 Test #17: dbcsr_test_csr_conversions ............................ Passed 4.73 sec
Start 18: dbcsr_test
18/19 Test #18: dbcsr_test ............................................ Passed 0.53 sec
Start 19: dbcsr_tensor_test
19/19 Test #19: dbcsr_tensor_test ..................................... Passed 0.84 sec
89% tests passed, 2 tests failed out of 19
@alazzaro Suggestions on how to fix this would be greatly appreciated.
Environment:
macOS 10.6.8 Rosetta (ppc32)
gcc 12.2.0
mpich-gcc12 @4.0.2+fortran
cmake-devel 20221130-3.25.1
ninja @1.11.1
OpenBLAS @0.3.21+gcc12+lapack+native
python310 @3.10.9
py-fypp @3.1
Portfile used: https://github.com/macports/macports-ports/blob/6e401b768cff5631fba66cca8ef346600a175c5a/math/dbcsr/Portfile
We have never tested such old versions of OSX, so I have no idea what the error could be. Some notes:
- From your log, it seems you are not running under MPI. I can see several repetitions in the log, e.g.
DBCSR| CPU Multiplication driver BLAS (D)
DBCSR| CPU Multiplication driver BLAS (D)
DBCSR| CPU Multiplication driver BLAS (D)
DBCSR| CPU Multiplication driver BLAS (D)
It seems there are 4 simultaneous instances (which is what you are running with "/usr/bin/mpiexec" "-n" "4"), but then DBCSR reports
DBCSR| MPI: Number of processes 1
so there is something wrong... I assume you should set the cmake flags:
-DMPIEXEC_EXECUTABLE="mpirun" \
-DTEST_MPI_RANKS="1" \
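For example (just an illustration of where those flags go; point MPIEXEC_EXECUTABLE at whichever launcher you actually want the tests to use):
cmake -DMPIEXEC_EXECUTABLE="$(which mpirun)" -DTEST_MPI_RANKS=1 ..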
- I assume you are using BLAS for the block multiplications. Actually, on OSX we used to test with the Accelerate framework, so I wonder if it could be introducing some issues here... Could you confirm which BLAS library is used? As an example, you can add:
-DBLAS_FOUND=ON -DBLAS_LIBRARIES="<path>" \
-DLAPACK_FOUND=ON -DLAPACK_LIBRARIES="<path>" \
to be more specific.
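For instance, if the intention is to test against OpenBLAS, that would look something like this (the path is only illustrative; with Macports it would presumably live under /opt/local/lib, and an OpenBLAS built with LAPACK support provides both interfaces from the same dylib):
-DBLAS_FOUND=ON -DBLAS_LIBRARIES=/opt/local/lib/libopenblas.dylib \
-DLAPACK_FOUND=ON -DLAPACK_LIBRARIES=/opt/local/lib/libopenblas.dylib \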
@alazzaro I spent quite some time on this today, but for some reason I cannot force it to use the correct MPI settings. It still uses:
Command: "/usr/bin/mpiexec" "-n" "4" "/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_dbcsr/dbcsr/work/build/tests/dbcsr_perf" "/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_dbcsr/dbcsr/work/dbcsr-2.5.0/tests/inputs/test_H2O.perf"
Directory: /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_dbcsr/dbcsr/work/build/tests
"dbcsr_perf:inputs/test_H2O.perf" start time: Jan 12 05:18 WIT
This is despite my passing -DMPIEXEC_EXECUTABLE=${prefix}/bin/mpiexec-mpich-gcc12 and -DTEST_MPI_RANKS="1" to CMake. I tried a variety of ways, with no effect whatsoever.
Where does this /usr/bin/mpiexec even come from?
I will try Accelerate, but I suspect that on old macOS OpenBLAS is a better bet.
@alazzaro So, with Accelerate it indeed seems to work better (at least on 10.6.8, I cannot check on native PPC right now), but now dbcsr_unittest1 times out (it was fine before with OpenBLAS):
---> Testing dbcsr
Executing: cd "/opt/local/var/macports/build/_opt_PPCRosettaPorts_math_dbcsr/dbcsr/work/build" && ctest test
Test project /opt/local/var/macports/build/_opt_PPCRosettaPorts_math_dbcsr/dbcsr/work/build
Start 1: dbcsr_perf:inputs/test_H2O.perf
1/19 Test #1: dbcsr_perf:inputs/test_H2O.perf ....................... Passed 802.29 sec
Start 2: dbcsr_perf:inputs/test_rect1_dense.perf
2/19 Test #2: dbcsr_perf:inputs/test_rect1_dense.perf ............... Passed 7.54 sec
Start 3: dbcsr_perf:inputs/test_rect1_sparse.perf
3/19 Test #3: dbcsr_perf:inputs/test_rect1_sparse.perf .............. Passed 35.68 sec
Start 4: dbcsr_perf:inputs/test_rect2_dense.perf
4/19 Test #4: dbcsr_perf:inputs/test_rect2_dense.perf ............... Passed 6.45 sec
Start 5: dbcsr_perf:inputs/test_rect2_sparse.perf
5/19 Test #5: dbcsr_perf:inputs/test_rect2_sparse.perf .............. Passed 34.73 sec
Start 6: dbcsr_perf:inputs/test_singleblock.perf
6/19 Test #6: dbcsr_perf:inputs/test_singleblock.perf ............... Passed 1.20 sec
Start 7: dbcsr_perf:inputs/test_square_dense.perf
7/19 Test #7: dbcsr_perf:inputs/test_square_dense.perf .............. Passed 2.05 sec
Start 8: dbcsr_perf:inputs/test_square_sparse.perf
8/19 Test #8: dbcsr_perf:inputs/test_square_sparse.perf ............. Passed 10.71 sec
Start 9: dbcsr_perf:inputs/test_square_sparse_bigblocks.perf
9/19 Test #9: dbcsr_perf:inputs/test_square_sparse_bigblocks.perf ... Passed 5.60 sec
Start 10: dbcsr_perf:inputs/test_square_sparse_rma.perf
10/19 Test #10: dbcsr_perf:inputs/test_square_sparse_rma.perf ......... Passed 9.74 sec
Start 11: dbcsr_unittest1
sh: /bin/ps: Operation not permitted
11/19 Test #11: dbcsr_unittest1 .......................................***Timeout 1499.97 sec
Start 12: dbcsr_unittest2
12/19 Test #12: dbcsr_unittest2 ....................................... Passed 370.93 sec
Start 13: dbcsr_unittest3
13/19 Test #13: dbcsr_unittest3 ....................................... Passed 153.45 sec
Start 14: dbcsr_unittest4
14/19 Test #14: dbcsr_unittest4 ....................................... Passed 1.60 sec
Start 15: dbcsr_tensor_unittest
15/19 Test #15: dbcsr_tensor_unittest ................................. Passed 19.65 sec
Start 16: dbcsr_tas_unittest
16/19 Test #16: dbcsr_tas_unittest .................................... Passed 9.14 sec
Start 17: dbcsr_test_csr_conversions
17/19 Test #17: dbcsr_test_csr_conversions ............................ Passed 23.92 sec
Start 18: dbcsr_test
18/19 Test #18: dbcsr_test ............................................ Passed 1.34 sec
Start 19: dbcsr_tensor_test
19/19 Test #19: dbcsr_tensor_test ..................................... Passed 1.97 sec
95% tests passed, 1 tests failed out of 19
Total Test time (real) = 2998.21 sec
The following tests FAILED:
11 - dbcsr_unittest1 (Timeout)
I have used these args:
configure.args-append \
    -DBLAS_FOUND=ON \
    -DBLAS_LIBRARIES=/usr/lib/libblas.dylib \
    -DLAPACK_FOUND=ON \
    -DLAPACK_LIBRARIES=/usr/lib/libLAPACK.dylib

if {[string match *gcc* ${configure.compiler}]} {
    configure.cflags-append \
        -flax-vector-conversions
}
Complete log from tests: tests_log_with_Accelerate.txt
This was my suspicion: on OSX we always assume that Accelerate is used... I will try to install a Vagrant machine with OSX and try to fix this problem for the case where OpenBLAS is used instead.
Thank you very much!
P. S. By the way, why does dbcsr_unittest1 time out now?
I can assume Accelerate is not really optimized, not sure though...
Well, it is old, and on earlier systems it is perhaps even worse (and it cannot be updated, being a system component).
If building with OpenBLAS is fixed, that would be great.
@alazzaro I actually do not see what is wrong there: everything passes in dbcsr_unittest1, but then it reports failure:
**********************************************************************
-- TESTING dbcsr_multiply (T, N, 7 , S, S, N) ............... PASSED !
**********************************************************************
<end of output>
Test time = 1499.97 sec
----------------------------------------------------------
Test Failed.
"dbcsr_unittest1" end time: Jan 13 00:52 WIT
"dbcsr_unittest1" time elapsed: 00:24:59
Not all tests were run due to the time limit being set?
P. S. I have determined that this is not what causes the test errors, but -DMPIEXEC_EXECUTABLE= keeps being ignored, and the ancient version from the system prefix is used instead of the one passed to CMake.
So, could you run the tests via env CTEST_OUTPUT_ON_FAILURE=1 make test ARGS="--timeout 2000"? I'm guessing, I really need to reproduce it on my side...
To detect MPI we are using FindMPI from CMake. According to the documentation (and from what I remember having tested), setting MPIEXEC_EXECUTABLE should be the correct way to override the MPI detection.
In my case I have both MPICH and OpenMPI installed via Homebrew and MPICH is the default (linked):
dbcsr/build.default-mpi on develop [$] ❯ cmake ..
[...]
-- Found MPI_C: /usr/local/Cellar/mpich/4.0.3/lib/libmpi.dylib (found version "4.0")
-- Found MPI_CXX: /usr/local/Cellar/mpich/4.0.3/lib/libmpicxx.dylib (found version "4.0")
-- Found MPI_Fortran: /usr/local/Cellar/mpich/4.0.3/lib/libmpifort.dylib (found version "4.0")
-- Found MPI: TRUE (found version "4.0") found components: C CXX Fortran
-- Setting build type to 'Release' as none was specified.
-- Performing Test f2008-norm2
-- Performing Test f2008-norm2 - Success
-- Performing Test f2008-block_construct
-- Performing Test f2008-block_construct - Success
-- Performing Test f2008-contiguous
-- Performing Test f2008-contiguous - Success
-- Performing Test f95-reshape-order-allocatable
-- Performing Test f95-reshape-order-allocatable - Success
-- FYPP preprocessor found.
Tests will run with 8 MPI ranks and 2 OpenMP threads each
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/tiziano/Projects/cp2k/dbcsr/build.default-mpi
Running in a different/fresh build directory:
dbcsr/build.openmpi on develop [$?] ❯ cmake -DMPIEXEC_EXECUTABLE=/usr/local/Cellar/open-mpi/4.1.4_2/bin/mpiexec -DTEST_MPI_RANKS=1 ..
[...]
-- Found MPI_C: /usr/local/Cellar/open-mpi/4.1.4_2/lib/libmpi.dylib (found version "3.1")
-- Found MPI_CXX: /usr/local/Cellar/open-mpi/4.1.4_2/lib/libmpi.dylib (found version "3.1")
-- Found MPI_Fortran: /usr/local/Cellar/open-mpi/4.1.4_2/lib/libmpi_usempif08.dylib (found version "3.1")
-- Found MPI: TRUE (found version "3.1") found components: C CXX Fortran
-- Setting build type to 'Release' as none was specified.
-- Performing Test f2008-norm2
-- Performing Test f2008-norm2 - Success
-- Performing Test f2008-block_construct
-- Performing Test f2008-block_construct - Success
-- Performing Test f2008-contiguous
-- Performing Test f2008-contiguous - Success
-- Performing Test f95-reshape-order-allocatable
-- Performing Test f95-reshape-order-allocatable - Success
-- FYPP preprocessor found.
Tests will run with 1 MPI ranks and 2 OpenMP threads each
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/tiziano/Projects/cp2k/dbcsr/build.openmpi
So, at least here it seems that CMake is following the instructions correctly.
I assume you are aware of this, but for many variables (like MPIEXEC_EXECUTABLE) you have to pass --fresh to reconfigure the full tree. TEST_MPI_RANKS gets picked up on a simple reconfigure, though.
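That is, roughly (using the launcher path from your earlier comment; purely illustrative):
cmake --fresh -DMPIEXEC_EXECUTABLE=/opt/local/bin/mpiexec-mpich-gcc12 -DTEST_MPI_RANKS=2 ..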
For future reference, can you please also post the complete CMake configure log? @barracuda156
@dev-zero Ok, with TEST_MPI_RANKS the problem was the quote marks, haha. -DTEST_MPI_RANKS=2 works fine.
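In other words, roughly (my guess at what happened; the exact lines are from memory):
-DTEST_MPI_RANKS="2"   # quotes are not stripped the way a shell would strip them, so the value was not recognised
-DTEST_MPI_RANKS=2     # picked up correctly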
I will update soon on the tests, let me try something.
@alazzaro Until OpenBLAS linking on Apple is fixed, maybe it is worth adding a note in the docs, like the one you have for Power9?
https://cp2k.github.io/dbcsr/develop/page/2-user-guide/1-installation/index.html
Or is it actually a PowerPC-specific issue rather than an Apple-specific one?
I have finally figured out how to force the Macports mpich to be used in the tests, but it seems to have broken everything: the very first test now fails with a timeout:
Test project /opt/local/var/macports/build/_opt_PPCRosettaPorts_math_dbcsr/dbcsr/work/build
Start 1: dbcsr_perf:inputs/test_H2O.perf
1/19 Test #1: dbcsr_perf:inputs/test_H2O.perf .......................***Timeout 1500.03 sec
This is just for the record. I will look for a combination of settings that works optimally.
@barracuda156 the timeout with MPICH in test_H2O.perf on macOS is something we are aware of, but we didn't have time to track it down yet. Linking to OpenBLAS on macOS is doable; it's just that by default CMake finds Accelerate first until told to look elsewhere:
cmake -DCMAKE_PREFIX_PATH="/usr/local/opt/openblas" ..
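(That prefix is the Homebrew one; for Macports the equivalent would presumably be something like cmake -DCMAKE_PREFIX_PATH=/opt/local .. , though I have not verified that this alone is enough to win over Accelerate there.)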
So far I've been able to successfully run all the tests on macOS with OpenBLAS+OpenMPI. Github Action runners fail, though.
@dev-zero There is no problem linking to OpenBLAS; in fact, I did that originally. The problem is that two tests fail, at least on 10.6.8 ppc32 (which was the reason for this ticket in the first place). We can use different linear algebra libs on different systems and/or archs, but that is harder to maintain.
the timeout with MPICH in test_H2O.perf on macOS is something we are aware of, but didn't have time to track this down yet
Hopefully that can be fixed. While on 10.6 the old system mpi works better than the new mpich, I am not sure it is going to work on 10.5, which we also want to support.
I vaguely recall that OpenMPI is somewhat broken either on PPC or on old macOS, but I need to try testing it with this port.
Build and test log (multiple builds): misc_build_test.txt
Conclusions for now:
- Use either Accelerate directly or vecLibFort (supposedly needed if BLAS is called from Fortran); apparently either works. OpenBLAS does not, failing badly with wrong results.
- Running tests with mpich 4.0.3 is broken. The system mpi kind of works; only one test times out.
P. S. The sh: /bin/ps: Operation not permitted error is solved like this (adding the following to the Portfile):
pre-test {
    # test infrastructure uses /bin/ps, which is forbidden by sandboxing
    append portsandbox_profile " (allow process-exec (literal \"/bin/ps\") (with no-profile))"
}
You may consider fixing it in the source code though.
So, could you run the tests via env CTEST_OUTPUT_ON_FAILURE=1 make test ARGS="--timeout 2000"? I'm guessing, I really need to reproduce it on my side...
@alazzaro How do I pass that to ctest? This has no effect: ctest test --timeout=2000. (With make it does not work, complaining about a missing target for test.)
@alazzaro @dev-zero Interesting that when linking to OpenBLAS, the first unit test passes quickly, while unit tests 2 and 3 fail badly with wrong results:
Start 11: dbcsr_unittest1
11/19 Test #11: dbcsr_unittest1 ....................................... Passed 158.55 sec
Start 12: dbcsr_unittest2
12/19 Test #12: dbcsr_unittest2 .......................................***Failed 0.78 sec
Start 13: dbcsr_unittest3
13/19 Test #13: dbcsr_unittest3 .......................................***Failed 13.93 sec
When linking to vecLibFort or Accelerate, unit tests 2 and 3 pass, but test 1 times out (takes more than 10 times longer!), though with no wrong results.
I have noticed that -D__ACCELERATE is not picked up automatically – neither with Accelerate nor with vecLibFort. I made a patch to src/CMakeLists and am testing again now.
UPD. No difference to the test results whatsoever. Whether this flag does something or not, its absence was not causing the problem.
Other tests run at full load on every core, while dbcsr_unittest1 is barely doing anything. That is likely the reason why it takes over 10 times longer (than when run with OpenBLAS) and eventually times out.
You may consider fixing it in the source code though.
AFAIK we're not calling ps directly, so this must originate from either MPI or ctest and is therefore unlikely to get fixed by us. Also, I would consider ps a rather essential tool to which a test system should have access.
I have noticed that -D__ACCELERATE is not picked up automatically – neither with Accelerate nor with vecLibFort. Made a patch to src/CMakeLists now and testing again. UPD. No difference to the test results whatsoever. Whether this flag does something or not, its absence was not causing the problem.
That's not good. macOS' Accelerate has a slightly different LAPACK API (returning double precision for single precision calls). Not defining it can therefore lead to wrong results.
We have this:
if (APPLE)
  # fix /proc/self/statm can not be opened on macOS
  target_compile_definitions(dbcsr PRIVATE __NO_STATM_ACCESS)
  if (BLAS_LIBRARIES MATCHES "Accelerate")
    target_compile_definitions(dbcsr PRIVATE __ACCELERATE)
  endif ()
endif ()
Which doesn't get triggered, since you're setting BLAS_LIBRARIES to a generic path (as per the logs you supplied), rather than the usual /Library/Developer/CommandLineTools/SDKs/MacOSX13.1.sdk/System/Library/Frameworks/Accelerate.framework I see in my logs.
I wonder, maybe we should check against BLA_VENDOR instead.
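On the user side that would correspond to the standard FindBLAS hint, e.g. (illustrative):
cmake -DBLA_VENDOR=Apple ..
for Accelerate, or -DBLA_VENDOR=OpenBLAS for OpenBLAS. The open question is whether the vendor is reliably known when BLAS_LIBRARIES is passed directly, as in your Portfile.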
AFAIK we're not calling ps directly, therefore this must originate from either MPI or ctest and therefore unlikely to get fixed by us. Also, I would consider ps a rather essential tool to which a test system should have access.
Got it. We added a fix in Macports. FWIU, it is a Mac sandboxing issue.
@dev-zero The standard Macports handling of this is via the linear_algebra PortGroup: https://github.com/macports/macports-ports/blob/92f50e73e1ae04fc6ced3ddec56e088edf99ce86/_resources/port1.0/group/linear_algebra-1.0.tcl#L63
So it looks like BLA_VENDOR should work (and it is a common definition, not Macports-specific).
It is perhaps desirable to expand the check to vecLibFort too: https://github.com/mcg1969/vecLibFort – which is used instead of Accelerate directly when Fortran support is needed.
P. S. The BLAS and LAPACK dylibs in /usr/lib are symlinks to Accelerate. But the specific path to Accelerate differs depending on the OS version.
According to the documentation this is exactly what vecLibFort is supposed to be fixing.
I wonder whether we shouldn't just start bundling vecLibFort as recommended by the project; then we could forget about our __ACCELERATE guards. @alazzaro what do you think? This might have to be synchronized with CP2K.
@dev-zero Macports defaults to vecLibFort unless the port explicitly asks not to use it. Do I get it right that we then do not need the __ACCELERATE macros at all? (If they are not needed but do not cause issues with vecLibFort, I will keep them for now, just to avoid making yet another PR to Macports for the same port.)
If you decide to use vecLibFort directly, please consider allowing an external one as well. Many users will have it installed already, whether in Macports or otherwise.
@dev-zero @alazzaro I am not closing the issue, since the discussion above may be relevant for further improvements (fixing OpenBLAS on Mac, using vecLibFort), but I got all tests passing on PPC:
---> Testing dbcsr
Executing: cd "/opt/local/var/macports/build/_opt_PPCRosettaPorts_math_dbcsr/dbcsr/work/build" && ctest test
Test project /opt/local/var/macports/build/_opt_PPCRosettaPorts_math_dbcsr/dbcsr/work/build
Start 1: dbcsr_perf:inputs/test_H2O.perf
1/19 Test #1: dbcsr_perf:inputs/test_H2O.perf ....................... Passed 329.59 sec
Start 2: dbcsr_perf:inputs/test_rect1_dense.perf
2/19 Test #2: dbcsr_perf:inputs/test_rect1_dense.perf ............... Passed 2.69 sec
Start 3: dbcsr_perf:inputs/test_rect1_sparse.perf
3/19 Test #3: dbcsr_perf:inputs/test_rect1_sparse.perf .............. Passed 12.66 sec
Start 4: dbcsr_perf:inputs/test_rect2_dense.perf
4/19 Test #4: dbcsr_perf:inputs/test_rect2_dense.perf ............... Passed 2.35 sec
Start 5: dbcsr_perf:inputs/test_rect2_sparse.perf
5/19 Test #5: dbcsr_perf:inputs/test_rect2_sparse.perf .............. Passed 9.97 sec
Start 6: dbcsr_perf:inputs/test_singleblock.perf
6/19 Test #6: dbcsr_perf:inputs/test_singleblock.perf ............... Passed 0.71 sec
Start 7: dbcsr_perf:inputs/test_square_dense.perf
7/19 Test #7: dbcsr_perf:inputs/test_square_dense.perf .............. Passed 0.94 sec
Start 8: dbcsr_perf:inputs/test_square_sparse.perf
8/19 Test #8: dbcsr_perf:inputs/test_square_sparse.perf ............. Passed 3.30 sec
Start 9: dbcsr_perf:inputs/test_square_sparse_bigblocks.perf
9/19 Test #9: dbcsr_perf:inputs/test_square_sparse_bigblocks.perf ... Passed 2.78 sec
Start 10: dbcsr_perf:inputs/test_square_sparse_rma.perf
10/19 Test #10: dbcsr_perf:inputs/test_square_sparse_rma.perf ......... Passed 3.53 sec
Start 11: dbcsr_unittest1
11/19 Test #11: dbcsr_unittest1 ....................................... Passed 1235.95 sec
Start 12: dbcsr_unittest2
12/19 Test #12: dbcsr_unittest2 ....................................... Passed 333.61 sec
Start 13: dbcsr_unittest3
13/19 Test #13: dbcsr_unittest3 ....................................... Passed 191.42 sec
Start 14: dbcsr_unittest4
14/19 Test #14: dbcsr_unittest4 ....................................... Passed 1.14 sec
Start 15: dbcsr_tensor_unittest
15/19 Test #15: dbcsr_tensor_unittest ................................. Passed 11.51 sec
Start 16: dbcsr_tas_unittest
16/19 Test #16: dbcsr_tas_unittest .................................... Passed 9.60 sec
Start 17: dbcsr_test_csr_conversions
17/19 Test #17: dbcsr_test_csr_conversions ............................ Passed 22.04 sec
Start 18: dbcsr_test
18/19 Test #18: dbcsr_test ............................................ Passed 0.53 sec
Start 19: dbcsr_tensor_test
19/19 Test #19: dbcsr_tensor_test ..................................... Passed 1.10 sec
100% tests passed, 0 tests failed out of 19
Total Test time (real) = 2175.63 sec
Used MPI_RANKS=2, which prevented the time-out.