
Spack spec for SST over RDMA?

Open MichaelLaufer opened this issue 4 years ago • 3 comments

Re. #2887

I have been trying to get a working configuration of ADIOS2 with SST with RDMA, but unfortunately it keeps falling back to WAN. I understand that only some versions of libfabric have been known to work.

I am using Spack for all software installations. Do you guys have a Spack spec that you know will allow this to work out of the box? Ideally I would like a configuration that does not require a proprietary, vendor-specific MPI library. For what it's worth, I am using Mellanox ConnectX-5 NICs.

Hopefully this will help others looking to get up and running with SST.

Many thanks, Mike

MichaelLaufer avatar Oct 26 '21 19:10 MichaelLaufer

The "known to work" part is the difficult bit. Unfortunately the RDMA transport in SST is built on libfabric, and building libfabric in Spack can be an issue. For example, if you simply build libfabric on Summit, it wants to build its own copy of one of the system dependencies (at the moment I don't recall if it's libibverbs or librdmacm). If you let it do that, you'll get a library that won't work when linked with MPI (which has to use the system version). On other machines there's a partial verbs library that is sufficient to make libfabric's autoconf think that verbs is supported, so if you don't specifically disable the verbs provider when configuring libfabric, you get a library that doesn't work.

That's just the build-time stuff. There are more issues with libfabric requiring run-time environment variables in some circumstances to help it find the right devices, etc. So while we'd love to have a simple set of instructions that would let you build and run with RDMA on any machine, we're not there yet. What we can do now is offer to help, and maybe try to develop some more specific READMEs on things to try.

If you're falling back to WAN, then often the first step is to set the SstVerbose environment variable so we can get some information about why that choice is being made. If you set SstVerbose=5, you may get a lot of output, but it gives us the most detail.
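For anyone who wants to capture that output for posting here, a minimal sketch of a wrapper follows. It assumes only what is stated above, namely that SST reads the `SstVerbose` environment variable. The helper name `run_with_sst_verbose` is hypothetical, as is the launch command in the usage note; neither is part of ADIOS2.

```python
import os
import subprocess
import sys


def run_with_sst_verbose(command, level=5):
    """Run `command` with SstVerbose set and return its combined output.

    `command` is a list of arguments, as accepted by subprocess.run.
    """
    # Copy the environment so the calling process is left untouched.
    env = dict(os.environ)
    # Variable name taken from the comment above; 5 is the most detailed level.
    env["SstVerbose"] = str(level)
    result = subprocess.run(
        command,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so no diagnostics are lost
        text=True,
        check=False,  # still return the output if the run itself fails
    )
    return result.stdout
```

A call might look like `run_with_sst_verbose(["mpirun", "-n", "2", "./my_sst_writer"])`, where the launcher and executable are placeholders for your own job. Under a batch scheduler, the variable may also need to be exported to the compute nodes in whatever way your launcher requires.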

eisenhauer avatar Oct 28 '21 13:10 eisenhauer

Thanks @eisenhauer, I will try a few more configurations and report back with findings/questions. Our setup is pretty generic, in both the hardware and the software stack, so hopefully I will be able to find the right combination.

MichaelLaufer avatar Oct 29 '21 07:10 MichaelLaufer

Please don't hesitate to post when you hit obstacles. If nothing else, understanding the problems experienced on different hardware can only help us towards a universal build solution...

eisenhauer avatar Oct 29 '21 15:10 eisenhauer