dbcsr
Two tests fail on macOS PPC: dbcsr_unittest2, dbcsr_unittest3
I am bringing dbcsr to MacPorts, where we support the full range of macOS versions, including old ones (at least 10.5+).
Two tests fail on 10.6.8 under Rosetta (I cannot test on native PPC at the moment, as I am away from PPC hardware): dbcsr_unittest2, dbcsr_unittest3.
---> Testing dbcsr
Executing: cd "/opt/local/var/macports/build/_opt_PPCRosettaPorts_math_dbcsr/dbcsr/work/build" && ctest test
Test project /opt/local/var/macports/build/_opt_PPCRosettaPorts_math_dbcsr/dbcsr/work/build
Start 1: dbcsr_perf:inputs/test_H2O.perf
1/19 Test #1: dbcsr_perf:inputs/test_H2O.perf ....................... Passed 581.03 sec
Start 2: dbcsr_perf:inputs/test_rect1_dense.perf
2/19 Test #2: dbcsr_perf:inputs/test_rect1_dense.perf ............... Passed 1.06 sec
Start 3: dbcsr_perf:inputs/test_rect1_sparse.perf
3/19 Test #3: dbcsr_perf:inputs/test_rect1_sparse.perf .............. Passed 10.55 sec
Start 4: dbcsr_perf:inputs/test_rect2_dense.perf
4/19 Test #4: dbcsr_perf:inputs/test_rect2_dense.perf ............... Passed 1.01 sec
Start 5: dbcsr_perf:inputs/test_rect2_sparse.perf
5/19 Test #5: dbcsr_perf:inputs/test_rect2_sparse.perf .............. Passed 8.58 sec
Start 6: dbcsr_perf:inputs/test_singleblock.perf
6/19 Test #6: dbcsr_perf:inputs/test_singleblock.perf ............... Passed 0.74 sec
Start 7: dbcsr_perf:inputs/test_square_dense.perf
7/19 Test #7: dbcsr_perf:inputs/test_square_dense.perf .............. Passed 0.73 sec
Start 8: dbcsr_perf:inputs/test_square_sparse.perf
8/19 Test #8: dbcsr_perf:inputs/test_square_sparse.perf ............. Passed 3.15 sec
Start 9: dbcsr_perf:inputs/test_square_sparse_bigblocks.perf
9/19 Test #9: dbcsr_perf:inputs/test_square_sparse_bigblocks.perf ... Passed 4.10 sec
Start 10: dbcsr_perf:inputs/test_square_sparse_rma.perf
10/19 Test #10: dbcsr_perf:inputs/test_square_sparse_rma.perf ......... Passed 2.90 sec
Start 11: dbcsr_unittest1
11/19 Test #11: dbcsr_unittest1 ....................................... Passed 158.55 sec
Start 12: dbcsr_unittest2
12/19 Test #12: dbcsr_unittest2 .......................................***Failed 0.78 sec
Start 13: dbcsr_unittest3
13/19 Test #13: dbcsr_unittest3 .......................................***Failed 13.93 sec
Start 14: dbcsr_unittest4
14/19 Test #14: dbcsr_unittest4 ....................................... Passed 0.62 sec
Start 15: dbcsr_tensor_unittest
15/19 Test #15: dbcsr_tensor_unittest ................................. Passed 10.30 sec
Start 16: dbcsr_tas_unittest
16/19 Test #16: dbcsr_tas_unittest .................................... Passed 4.37 sec
Start 17: dbcsr_test_csr_conversions
17/19 Test #17: dbcsr_test_csr_conversions ............................ Passed 4.73 sec
Start 18: dbcsr_test
18/19 Test #18: dbcsr_test ............................................ Passed 0.53 sec
Start 19: dbcsr_tensor_test
19/19 Test #19: dbcsr_tensor_test ..................................... Passed 0.84 sec
89% tests passed, 2 tests failed out of 19
@alazzaro Suggestions on how to fix this are greatly appreciated.
Environment:
macOS 10.6.8 Rosetta (ppc32)
gcc 12.2.0
mpich-gcc12 @4.0.2+fortran
cmake-devel 20221130-3.25.1
ninja @1.11.1
OpenBLAS @0.3.21+gcc12+lapack+native
python310 @3.10.9
py-fypp @3.1
Portfile used: https://github.com/macports/macports-ports/blob/6e401b768cff5631fba66cca8ef346600a175c5a/math/dbcsr/Portfile
We have never tested such old versions of OSX, so I have no idea what the error could be. Some notes:
- From your log, it seems you are not running under MPI. I can see several repetitions in the log, e.g.
DBCSR| CPU Multiplication driver BLAS (D)
DBCSR| CPU Multiplication driver BLAS (D)
DBCSR| CPU Multiplication driver BLAS (D)
DBCSR| CPU Multiplication driver BLAS (D)
It seems there are 4 simultaneous instances (which is what you are running with "/usr/bin/mpiexec" "-n" "4"), but then DBCSR reports
DBCSR| MPI: Number of processes 1
so there is something wrong... I assume you should set the cmake flags:
-DMPIEXEC_EXECUTABLE="mpirun" \
-DTEST_MPI_RANKS="1" \
- I assume you are using BLAS for the block multiplications. Actually, on OSX we used to test with the Accelerate framework, so I wonder if it can introduce some issues here... Could you confirm which BLAS library is used? As an example, you can add:
-DBLAS_FOUND=ON -DBLAS_LIBRARIES="<path>" \
-DLAPACK_FOUND=ON -DLAPACK_LIBRARIES="<path>" \
to be more specific; a combined configure example is sketched below.
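For example, a full configure invocation combining these might look something like this (just a sketch; the library paths are placeholders for whatever your installation provides):
cmake \
  -DMPIEXEC_EXECUTABLE="mpirun" \
  -DTEST_MPI_RANKS="1" \
  -DBLAS_FOUND=ON -DBLAS_LIBRARIES="<path>" \
  -DLAPACK_FOUND=ON -DLAPACK_LIBRARIES="<path>" \
  ..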
@alazzaro I spent quite some time today on this, but I cannot force it to use the correct MPI settings for some reason. It still uses:
Command: "/usr/bin/mpiexec" "-n" "4" "/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_dbcsr/dbcsr/work/build/tests/dbcsr_perf" "/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_dbcsr/dbcsr/work/dbcsr-2.5.0/tests/inputs/test_H2O.perf"
Directory: /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_dbcsr/dbcsr/work/build/tests
"dbcsr_perf:inputs/test_H2O.perf" start time: Jan 12 05:18 WIT
This is despite my passing -DMPIEXEC_EXECUTABLE=${prefix}/bin/mpiexec-mpich-gcc12 and -DTEST_MPI_RANKS="1" to CMake. I tried a variety of ways, with no effect whatsoever.
Where does this /usr/bin/mpiexec even come from?
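(For reference, the launcher CMake actually picked should be recorded in the cache, so something like
grep -i MPIEXEC CMakeCache.txt
run in the build directory ought to show where /usr/bin/mpiexec comes from; I have not dug into that yet.)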
I will try Accelerate, but I suspect that on old macOS OpenBLAS is a better bet.
@alazzaro So, with Accelerate it does seem to work better indeed (at least on 10.6.8; I cannot check on native PPC right now), but now dbcsr_unittest1 times out (it was fine before with OpenBLAS):
---> Testing dbcsr
Executing: cd "/opt/local/var/macports/build/_opt_PPCRosettaPorts_math_dbcsr/dbcsr/work/build" && ctest test
Test project /opt/local/var/macports/build/_opt_PPCRosettaPorts_math_dbcsr/dbcsr/work/build
Start 1: dbcsr_perf:inputs/test_H2O.perf
1/19 Test #1: dbcsr_perf:inputs/test_H2O.perf ....................... Passed 802.29 sec
Start 2: dbcsr_perf:inputs/test_rect1_dense.perf
2/19 Test #2: dbcsr_perf:inputs/test_rect1_dense.perf ............... Passed 7.54 sec
Start 3: dbcsr_perf:inputs/test_rect1_sparse.perf
3/19 Test #3: dbcsr_perf:inputs/test_rect1_sparse.perf .............. Passed 35.68 sec
Start 4: dbcsr_perf:inputs/test_rect2_dense.perf
4/19 Test #4: dbcsr_perf:inputs/test_rect2_dense.perf ............... Passed 6.45 sec
Start 5: dbcsr_perf:inputs/test_rect2_sparse.perf
5/19 Test #5: dbcsr_perf:inputs/test_rect2_sparse.perf .............. Passed 34.73 sec
Start 6: dbcsr_perf:inputs/test_singleblock.perf
6/19 Test #6: dbcsr_perf:inputs/test_singleblock.perf ............... Passed 1.20 sec
Start 7: dbcsr_perf:inputs/test_square_dense.perf
7/19 Test #7: dbcsr_perf:inputs/test_square_dense.perf .............. Passed 2.05 sec
Start 8: dbcsr_perf:inputs/test_square_sparse.perf
8/19 Test #8: dbcsr_perf:inputs/test_square_sparse.perf ............. Passed 10.71 sec
Start 9: dbcsr_perf:inputs/test_square_sparse_bigblocks.perf
9/19 Test #9: dbcsr_perf:inputs/test_square_sparse_bigblocks.perf ... Passed 5.60 sec
Start 10: dbcsr_perf:inputs/test_square_sparse_rma.perf
10/19 Test #10: dbcsr_perf:inputs/test_square_sparse_rma.perf ......... Passed 9.74 sec
Start 11: dbcsr_unittest1
sh: /bin/ps: Operation not permitted
11/19 Test #11: dbcsr_unittest1 .......................................***Timeout 1499.97 sec
Start 12: dbcsr_unittest2
12/19 Test #12: dbcsr_unittest2 ....................................... Passed 370.93 sec
Start 13: dbcsr_unittest3
13/19 Test #13: dbcsr_unittest3 ....................................... Passed 153.45 sec
Start 14: dbcsr_unittest4
14/19 Test #14: dbcsr_unittest4 ....................................... Passed 1.60 sec
Start 15: dbcsr_tensor_unittest
15/19 Test #15: dbcsr_tensor_unittest ................................. Passed 19.65 sec
Start 16: dbcsr_tas_unittest
16/19 Test #16: dbcsr_tas_unittest .................................... Passed 9.14 sec
Start 17: dbcsr_test_csr_conversions
17/19 Test #17: dbcsr_test_csr_conversions ............................ Passed 23.92 sec
Start 18: dbcsr_test
18/19 Test #18: dbcsr_test ............................................ Passed 1.34 sec
Start 19: dbcsr_tensor_test
19/19 Test #19: dbcsr_tensor_test ..................................... Passed 1.97 sec
95% tests passed, 1 tests failed out of 19
Total Test time (real) = 2998.21 sec
The following tests FAILED:
11 - dbcsr_unittest1 (Timeout)
I have used these args:
configure.args-append \
    -DBLAS_FOUND=ON \
    -DBLAS_LIBRARIES=/usr/lib/libblas.dylib \
    -DLAPACK_FOUND=ON \
    -DLAPACK_LIBRARIES=/usr/lib/libLAPACK.dylib

if {[string match *gcc* ${configure.compiler}]} {
    configure.cflags-append \
        -flax-vector-conversions
}
Complete log from tests: tests_log_with_Accelerate.txt
This was my suspicion: on OSX we always assume that Accelerate is used... I will try to install a Vagrant machine with OSX and try to fix this problem when people make OpenBLAS available.
Thank you very much!
P. S. By the way, why does dbcsr_unittest1 time out now?
I can assume Accelerate is not really optimized, not sure though...
Well, it is old, and on earlier systems it is perhaps even worse (and it cannot be updated, being a system component).
If building with OpenBLAS is fixed, that would be great.
@alazzaro I actually do not see what is wrong there: everything passes in dbcsr_unittest1, but then it reports failure:
**********************************************************************
-- TESTING dbcsr_multiply (T, N, 7 , S, S, N) ............... PASSED !
**********************************************************************
<end of output>
Test time = 1499.97 sec
----------------------------------------------------------
Test Failed.
"dbcsr_unittest1" end time: Jan 13 00:52 WIT
"dbcsr_unittest1" time elapsed: 00:24:59
Is it that not all tests were run due to the time limit being set?
P. S. This has been determined not to cause the test errors, but -DMPIEXEC_EXECUTABLE= keeps being ignored, and the ancient version from the system prefix is used instead of the one passed to CMake.
So, could you run the tests via
env CTEST_OUTPUT_ON_FAILURE=1 make test ARGS="--timeout 2000"
? I'm guessing, I really need to reproduce it on my side...
To detect MPI we are using FindMPI from CMake.
According to the documentation (and from what I remember to have tested), setting MPIEXEC_EXECUTABLE should be the correct way to override the MPI detection.
In my case I have both MPICH and OpenMPI installed via Homebrew and MPICH is the default (linked):
dbcsr/build.default-mpi on develop [$] ❯ cmake ..
[...]
-- Found MPI_C: /usr/local/Cellar/mpich/4.0.3/lib/libmpi.dylib (found version "4.0")
-- Found MPI_CXX: /usr/local/Cellar/mpich/4.0.3/lib/libmpicxx.dylib (found version "4.0")
-- Found MPI_Fortran: /usr/local/Cellar/mpich/4.0.3/lib/libmpifort.dylib (found version "4.0")
-- Found MPI: TRUE (found version "4.0") found components: C CXX Fortran
-- Setting build type to 'Release' as none was specified.
-- Performing Test f2008-norm2
-- Performing Test f2008-norm2 - Success
-- Performing Test f2008-block_construct
-- Performing Test f2008-block_construct - Success
-- Performing Test f2008-contiguous
-- Performing Test f2008-contiguous - Success
-- Performing Test f95-reshape-order-allocatable
-- Performing Test f95-reshape-order-allocatable - Success
-- FYPP preprocessor found.
Tests will run with 8 MPI ranks and 2 OpenMP threads each
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/tiziano/Projects/cp2k/dbcsr/build.default-mpi
Running in a different/fresh build directory:
dbcsr/build.openmpi on develop [$?] ❯ cmake -DMPIEXEC_EXECUTABLE=/usr/local/Cellar/open-mpi/4.1.4_2/bin/mpiexec -DTEST_MPI_RANKS=1 ..
[...]
-- Found MPI_C: /usr/local/Cellar/open-mpi/4.1.4_2/lib/libmpi.dylib (found version "3.1")
-- Found MPI_CXX: /usr/local/Cellar/open-mpi/4.1.4_2/lib/libmpi.dylib (found version "3.1")
-- Found MPI_Fortran: /usr/local/Cellar/open-mpi/4.1.4_2/lib/libmpi_usempif08.dylib (found version "3.1")
-- Found MPI: TRUE (found version "3.1") found components: C CXX Fortran
-- Setting build type to 'Release' as none was specified.
-- Performing Test f2008-norm2
-- Performing Test f2008-norm2 - Success
-- Performing Test f2008-block_construct
-- Performing Test f2008-block_construct - Success
-- Performing Test f2008-contiguous
-- Performing Test f2008-contiguous - Success
-- Performing Test f95-reshape-order-allocatable
-- Performing Test f95-reshape-order-allocatable - Success
-- FYPP preprocessor found.
Tests will run with 1 MPI ranks and 2 OpenMP threads each
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/tiziano/Projects/cp2k/dbcsr/build.openmpi
So, at least here it seems that CMake is following the instructions correctly.
I assume you are aware of this, but for many variables (like MPIEXEC_EXECUTABLE) you have to pass --fresh to reconfigure the full tree. TEST_MPI_RANKS gets picked up on a simple reconfigure, though.
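For example, something along these lines (an untested sketch, reusing the launcher path from your earlier attempt):
cmake --fresh -DMPIEXEC_EXECUTABLE=${prefix}/bin/mpiexec-mpich-gcc12 -DTEST_MPI_RANKS=1 ..
(--fresh requires CMake 3.24+, which your cmake-devel 3.25.1 satisfies.)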
For future reference, can you please also post the complete CMake configure log? @barracuda156
@dev-zero Ok, with TEST_MPI_RANKS the problem was the quote marks, haha. -DTEST_MPI_RANKS=2 works fine.
I will update soon on the tests, let me try something.
@alazzaro Until OpenBLAS linking on Apple is fixed, maybe it is worth adding a note in the docs, like you have for Power9?
https://cp2k.github.io/dbcsr/develop/page/2-user-guide/1-installation/index.html
Or is it actually a PowerPC-specific issue and not an Apple-specific one?
I have finally figured out how to force the MacPorts mpich to be used in the tests, but that seems to have broken everything: the very first test now fails with a timeout:
Test project /opt/local/var/macports/build/_opt_PPCRosettaPorts_math_dbcsr/dbcsr/work/build
Start 1: dbcsr_perf:inputs/test_H2O.perf
1/19 Test #1: dbcsr_perf:inputs/test_H2O.perf .......................***Timeout 1500.03 sec
This is just for the record. I will look for a combination of settings that works optimally.
@barracuda156 the timeout with MPICH in test_H2O.perf on macOS is something we are aware of, but we didn't have time to track it down yet. Linking to OpenBLAS on macOS is doable; it's just that by default CMake finds Accelerate first until told to look elsewhere:
cmake -DCMAKE_PREFIX_PATH="/usr/local/opt/openblas" ..
So far I've been able to successfully run all the tests on macOS with OpenBLAS+OpenMPI. Github Action runners fail, though.
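(With MacPorts the analogue would presumably be pointing CMAKE_PREFIX_PATH at the MacPorts prefix, e.g. cmake -DCMAKE_PREFIX_PATH=/opt/local .., though I have not tried that combination myself.)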
@dev-zero There is no problem linking to OpenBLAS; in fact, I did that originally. The problem is that two tests fail, at least on 10.6.8 ppc32 (which was the reason for this ticket in the first place). We can use different linear algebra libs on different systems and/or archs, but that is harder to maintain.
the timeout with MPICH in test_H2O.perf on macOS is something we are aware of, but didn't have time to track this down yet
Hopefully that can be fixed. While on 10.6 the old system mpi works better than the new mpich, I am not sure it is going to work on 10.5, which we also want to support.
I vaguely recall that OpenMPI is kinda broken either on PPC or on old macOS, but I need to try testing it with this port.
Build and test log (multiple builds): misc_build_test.txt
Conclusions for now:
- Use either Accelerate directly or vecLibFort (supposed to be needed if BLAS is called from Fortran); apparently either works. OpenBLAS does not, failing badly with wrong results.
- Running tests with mpich 4.0.3 is broken. System mpi kinda works, only one test times out.
P. S. The "sh: /bin/ps: Operation not permitted" error is solved like this (adding it to the Portfile):
pre-test {
    # test infrastructure uses /bin/ps, which is forbidden by sandboxing
    append portsandbox_profile " (allow process-exec (literal \"/bin/ps\") (with no-profile))"
}
You may consider fixing it in the source code though.
So, could you run the tests via
env CTEST_OUTPUT_ON_FAILURE=1 make test ARGS="--timeout 2000"? I'm guessing, I really need to reproduce it on my side...
@alazzaro How do I pass that to ctest? This does not have any effect: ctest test --timeout=2000. (With make it does not work at all, complaining about a missing target for test.)
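(If I read the ctest options right, the direct equivalent of that make invocation should be roughly
ctest --output-on-failure --timeout 2000
run from the build directory, but I have not verified that on this setup.)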
@alazzaro @dev-zero Interestingly, when linking to OpenBLAS, the first unit test passes quickly, while unit tests 2 and 3 fail badly with wrong results:
Start 11: dbcsr_unittest1
11/19 Test #11: dbcsr_unittest1 ....................................... Passed 158.55 sec
Start 12: dbcsr_unittest2
12/19 Test #12: dbcsr_unittest2 .......................................***Failed 0.78 sec
Start 13: dbcsr_unittest3
13/19 Test #13: dbcsr_unittest3 .......................................***Failed 13.93 sec
When linking to vecLibFort or Accelerate, unit tests 2 and 3 pass, but test 1 times out (it takes more than 10 times longer!), though with no wrong results.
I have noticed that -D__ACCELERATE is not picked up automatically – neither with Accelerate nor with vecLibFort. Made a patch to src/CMakeLists now and am testing again.
UPD. No difference to the test results whatsoever. Whether this flag does something or not, its absence was not causing the problem.
Other tests run at full load on every core, while dbcsr_unittest1 is barely doing anything:

Likely this is the reason why it takes 10 times longer (than when run under OpenBLAS) and eventually times out.
You may consider fixing it in the source code though.
AFAIK we're not calling ps directly, so this must originate from either MPI or ctest and is therefore unlikely to get fixed by us. Also, I would consider ps a rather essential tool to which a test system should have access.
I have noticed that -D__ACCELERATE is not picked up automatically – neither with Accelerate nor with vecLibFort. Made a patch to src/CMakeLists now and am testing again. UPD. No difference to the test results whatsoever. Whether this flag does something or not, its absence was not causing the problem.
That's not good. macOS' Accelerate has a slightly different LAPACK API (returning double precision for single-precision calls). Not defining it can therefore lead to wrong results.
We have this:
if (APPLE)
  # fix /proc/self/statm can not be opened on macOS
  target_compile_definitions(dbcsr PRIVATE __NO_STATM_ACCESS)
  if (BLAS_LIBRARIES MATCHES "Accelerate")
    target_compile_definitions(dbcsr PRIVATE __ACCELERATE)
  endif ()
endif ()
Which doesn't get triggered since you're setting BLAS_LIBRARIES to a generic path (as per the logs you supplied), rather than the usual /Library/Developer/CommandLineTools/SDKs/MacOSX13.1.sdk/System/Library/Frameworks/Accelerate.framework I see in my logs.
I wonder whether we should check against BLA_VENDOR instead.
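Something like this, perhaps (an untested sketch; the exact BLA_VENDOR values to match would need checking):
if (APPLE)
  # fix /proc/self/statm can not be opened on macOS
  target_compile_definitions(dbcsr PRIVATE __NO_STATM_ACCESS)
  # treat both the vendor hint and the resolved libraries as indicators of Accelerate/vecLib
  if (BLA_VENDOR MATCHES "Apple" OR BLAS_LIBRARIES MATCHES "Accelerate|vecLib")
    target_compile_definitions(dbcsr PRIVATE __ACCELERATE)
  endif ()
endif ()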
AFAIK we're not calling ps directly, so this must originate from either MPI or ctest and is therefore unlikely to get fixed by us. Also, I would consider ps a rather essential tool to which a test system should have access.
Got it. We added a fix in MacPorts. FWIU, it is a Mac sandboxing issue.
That's not good. macOS' Accelerate has a slightly different LAPACK API (returning double precision for single-precision calls). Not defining it can therefore lead to wrong results. Which doesn't get triggered since you're setting BLAS_LIBRARIES to a generic path (as per the logs you supplied), rather than the usual /Library/Developer/CommandLineTools/SDKs/MacOSX13.1.sdk/System/Library/Frameworks/Accelerate.framework I see in my logs. I wonder whether we should check against BLA_VENDOR instead.
@dev-zero MacPorts' standard handling of this is via the linear_algebra PortGroup: https://github.com/macports/macports-ports/blob/92f50e73e1ae04fc6ced3ddec56e088edf99ce86/_resources/port1.0/group/linear_algebra-1.0.tcl#L63
So it looks like BLA_VENDOR should work (and it is a common definition, not MacPorts-specific).
It is perhaps desirable to expand the check to vecLibFort too: https://github.com/mcg1969/vecLibFort – which is used instead of Accelerate directly when Fortran support is needed.
P. S. The BLAS and LAPACK dylibs in /usr/lib are symlinks to Accelerate, but the specific path to Accelerate differs depending on the OS version.
It is perhaps desirable to expand the check to vecLibFort too: https://github.com/mcg1969/vecLibFort – which is used instead of Accelerate directly when Fortran support is needed.
According to the documentation this is exactly what vecLibFort is supposed to be fixing.
I wonder whether we shouldn't just start bundling vecLibFort as recommended by the project and then we can forget about our __ACCELERATE guards. @alazzaro what do you think? This might have to be synchronized with CP2K.
According to the documentation this is exactly what vecLibFort is supposed to be fixing. I wonder whether we shouldn't just start bundling vecLibFort as recommended by the project and then we can forget about our __ACCELERATE guards. @alazzaro what do you think? This might have to be synchronized with CP2K.
@dev-zero MacPorts defaults to vecLibFort unless the port explicitly asks not to use it. Do I get it right that we then do not need the __ACCELERATE macros at all? (If they are not needed but do not cause issues with vecLibFort, I will keep them for now, just to avoid making yet another MacPorts PR for the same port.)
If you decide to use vecLibFort directly, please consider allowing an external one as well. Many users will have it installed already, whether via MacPorts or otherwise.
@dev-zero @alazzaro I am not closing the issue, since the above discussion may be relevant for further improvements (fixing OpenBLAS on Mac, using vecLibFort), but I got all tests passing on PPC:
---> Testing dbcsr
Executing: cd "/opt/local/var/macports/build/_opt_PPCRosettaPorts_math_dbcsr/dbcsr/work/build" && ctest test
Test project /opt/local/var/macports/build/_opt_PPCRosettaPorts_math_dbcsr/dbcsr/work/build
Start 1: dbcsr_perf:inputs/test_H2O.perf
1/19 Test #1: dbcsr_perf:inputs/test_H2O.perf ....................... Passed 329.59 sec
Start 2: dbcsr_perf:inputs/test_rect1_dense.perf
2/19 Test #2: dbcsr_perf:inputs/test_rect1_dense.perf ............... Passed 2.69 sec
Start 3: dbcsr_perf:inputs/test_rect1_sparse.perf
3/19 Test #3: dbcsr_perf:inputs/test_rect1_sparse.perf .............. Passed 12.66 sec
Start 4: dbcsr_perf:inputs/test_rect2_dense.perf
4/19 Test #4: dbcsr_perf:inputs/test_rect2_dense.perf ............... Passed 2.35 sec
Start 5: dbcsr_perf:inputs/test_rect2_sparse.perf
5/19 Test #5: dbcsr_perf:inputs/test_rect2_sparse.perf .............. Passed 9.97 sec
Start 6: dbcsr_perf:inputs/test_singleblock.perf
6/19 Test #6: dbcsr_perf:inputs/test_singleblock.perf ............... Passed 0.71 sec
Start 7: dbcsr_perf:inputs/test_square_dense.perf
7/19 Test #7: dbcsr_perf:inputs/test_square_dense.perf .............. Passed 0.94 sec
Start 8: dbcsr_perf:inputs/test_square_sparse.perf
8/19 Test #8: dbcsr_perf:inputs/test_square_sparse.perf ............. Passed 3.30 sec
Start 9: dbcsr_perf:inputs/test_square_sparse_bigblocks.perf
9/19 Test #9: dbcsr_perf:inputs/test_square_sparse_bigblocks.perf ... Passed 2.78 sec
Start 10: dbcsr_perf:inputs/test_square_sparse_rma.perf
10/19 Test #10: dbcsr_perf:inputs/test_square_sparse_rma.perf ......... Passed 3.53 sec
Start 11: dbcsr_unittest1
11/19 Test #11: dbcsr_unittest1 ....................................... Passed 1235.95 sec
Start 12: dbcsr_unittest2
12/19 Test #12: dbcsr_unittest2 ....................................... Passed 333.61 sec
Start 13: dbcsr_unittest3
13/19 Test #13: dbcsr_unittest3 ....................................... Passed 191.42 sec
Start 14: dbcsr_unittest4
14/19 Test #14: dbcsr_unittest4 ....................................... Passed 1.14 sec
Start 15: dbcsr_tensor_unittest
15/19 Test #15: dbcsr_tensor_unittest ................................. Passed 11.51 sec
Start 16: dbcsr_tas_unittest
16/19 Test #16: dbcsr_tas_unittest .................................... Passed 9.60 sec
Start 17: dbcsr_test_csr_conversions
17/19 Test #17: dbcsr_test_csr_conversions ............................ Passed 22.04 sec
Start 18: dbcsr_test
18/19 Test #18: dbcsr_test ............................................ Passed 0.53 sec
Start 19: dbcsr_tensor_test
19/19 Test #19: dbcsr_tensor_test ..................................... Passed 1.10 sec
100% tests passed, 0 tests failed out of 19
Total Test time (real) = 2175.63 sec
I used TEST_MPI_RANKS=2, which prevented the timeout.