Understanding Elemental's Performance
Hi,
I am trying to understand the performance of this program at NERSC. It is basically the same as the example in the README.md, except that addprocs currently doesn't work for me, so I am using this (manual) approach of driving the MPIClusterManager with start_main_loop and stop_main_loop:
N = parse(Int64, ARGS[1])
# to import MPIManager
using MPIClusterManagers
# need to also import Distributed to use addprocs()
using Distributed
# Manage MPIManager manually -- all MPI ranks do the same work
# Start MPIManager
manager = MPIClusterManagers.start_main_loop(MPI_TRANSPORT_ALL)
@mpi_do manager begin
    using MPI
    comm = MPI.COMM_WORLD
    println(
        "Hello world,"
        * " I am $(MPI.Comm_rank(comm)) of $(MPI.Comm_size(comm))"
        * " on node $(gethostname())"
    )
    println("[rank $(MPI.Comm_rank(comm))]: Importing Elemental")
    using LinearAlgebra, Elemental
    println("[rank $(MPI.Comm_rank(comm))]: Done importing Elemental")
    println("[rank $(MPI.Comm_rank(comm))]: Solving SVD for $(N)x$(N)")
end
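# Build an NxN distributed matrix with Gaussian entries, time only the SVD,
# and print the largest singular value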
@mpi_do manager A = Elemental.DistMatrix(Float64);
@mpi_do manager Elemental.gaussian!(A, N, N);
@mpi_do manager @time U, s, V = svd(A);
@mpi_do manager println(s[1])
# Manage MPIManager manually:
# Elemental needs to be finalized before shutting down MPIManager
@mpi_do manager begin
    println("[rank $(MPI.Comm_rank(comm))]: Finalizing Elemental")
    Elemental.Finalize()
    println("[rank $(MPI.Comm_rank(comm))]: Done finalizing Elemental")
end
# Shut down MPIManager
MPIClusterManagers.stop_main_loop(manager)
I ran some strong scaling tests on 4 Intel Haswell nodes (https://docs.nersc.gov/systems/cori/#haswell-compute-nodes) using 4000x4000, 8000x8000, and 16000x16000 random matrices.

I am measuring only the svd(A) time. I am attaching my measured times, and wanted to check whether this is what you would expect. I am not an expert in how Elemental computes SVDs in a distributed fashion, so I would be grateful for any advice you have for optimizing this benchmark's performance. In particular, I am interested in understanding what the optimal number of ranks is as a function of problem size (I am hoping that this is such an obvious question that you can point me to some existing documentation).
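In case the per-rank @time output turns out to be noisy, this is a minimal sketch of the timing variant I could switch to (it assumes the manager, comm, A, and N from the script above, and only adds MPI.Barrier and MPI.Wtime from MPI.jl), so that startup skew on any single rank doesn't leak into the measurement:

@mpi_do manager begin
    # Synchronize all ranks, then measure only the SVD with wall-clock time
    MPI.Barrier(comm)
    t0 = MPI.Wtime()
    U, s, V = svd(A)
    MPI.Barrier(comm)
    t1 = MPI.Wtime()
    if MPI.Comm_rank(comm) == 0
        println("SVD of $(N)x$(N) took $(t1 - t0) seconds")
    end
end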
Cheers!
First, it might be useful to confirm that the same pattern shows up when you compile and run a C++ version of this problem.
That was what I was thinking. Unfortunately I am not familiar with how to use Elemental directly, and the documentation hosting seems to be broken (and I can't find the documentation sources either). Do you know where I can find a copy of the full docs? I am looking for the C++ equivalents of Elemental.DistMatrix, Elemental.gaussian!, and svd, so that I can replicate the example above in C++.
I am able to build libEl.
Cheers, Johannes
It looks like you can still browse the HTML version of the documentation, although it doesn't render correctly. I think the best place for you to look is https://github.com/LLNL/Elemental/blob/hydrogen/tests/lapack_like/SVD.cpp#L157. It should be possible to adapt that test into something similar to the example above.
Thanks for the blob -- I'll try to understand it given the docs that I can find. At this point I only understand 10%. Btw, not all of the docs can be browsed: https://elemental.github.io/documentation/0.85/core/dist_matrix.html
The source for the documentation is at https://github.com/elemental/elemental-web. I've asked your colleagues at LLNL if they could start hosting the docs, since they are already maintaining the fork of Elemental: https://github.com/LLNL/Elemental/issues/80#issuecomment-937447310.
Thanks! I'll also look into hosting that locally.
FTR: NERSC is at LBNL, and LBNL != LLNL. It's a common misunderstanding, and we are all friends.
I had the pleasure of spending some days at NERSC a couple of years ago while working on a project where we ran Julia code on Cori, so I'm well aware that they are two different labs. The "colleagues" was in the sense that you are both under DOE. The folks at Livermore forked Elemental a couple of years ago, so it would make sense for them to host the documentation, but if you don't mind doing it, that would also be great.