Exceeding max threads with IBM mpi on multiple reads
### Describe the problem you're observing
On a system using IBM's MPI, consecutively reading files (either with the writeread example, or the write and then read examples) until the cumulative total reaches what would be 64 clients on a single host having read from files causes the server to crash and the client to hang at https://github.com/LLNL/UnifyCR/blob/9d47738b15a5f939861a01788ca1872d1007ed1c/client/src/unifycr-sysio.c#L1927-L1929 after invoke_client_read_rpc has been called.
Stepping through with a debugger allowed for a more controlled failure that gave the error:

```
[host:pid] ERROR: Exceeded maximum supported application threads (64). Use common_pami_max_threads MCA parameter to increase this limit
```
The error does not occur on a system using MVAPICH, but that may simply be because its thread limit is higher (other MPI libraries haven't been tried).
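If raising the limit is worth trying before a real fix lands: IBM's Spectrum MPI is Open MPI-derived, so MCA parameters can usually be set either through the environment or on the launch command line. A sketch, where the parameter name comes from the error message above and the `OMPI_MCA_` environment-variable spelling is an assumption about this installation:

```shell
# Assumption: Spectrum MPI accepts MCA parameters the Open MPI way.
# Either export it before launching...
export OMPI_MCA_common_pami_max_threads=128

# ...or pass it on the launch command line, e.g.:
# mpirun -mca common_pami_max_threads 128 ...
```

Note this only moves the ceiling; if threads are leaking per read, the crash would just happen later.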
### Describe how to reproduce the problem
On a system using IBM's MPI, start the server:
```shell
export UNIFYCR_DAEMONIZE=off
export UNIFYCR_LOG_VERBOSITY=5
export UNIFYCR_SPILLOVER_META_DIR=/path/to/ssd
export UNIFYCR_SPILLOVER_DATA_DIR=/path/to/ssd
/path/to/install/bin/unifycr start -S /path/to/shared/filesystem -e /path/to/UnifyCR/install/bin/unifycrd &
```
Then consecutively write and then read enough files to equate to what would be 64 clients on a single host having read (e.g., 2 clients per host -> 32 files; 4 clients per host -> 16 files; etc.). The writeread example can be used, or the write and then read examples (static or gotcha). The crash happens both when using the same file size each time and when changing file sizes between runs.
Example using the writeread example with 4 clients per host and the same file size:
```shell
for i in {1..16}
do
    echo $i
    jsrun -n1 -a4 -c18 -r1 /prefix/UnifyCR/install/libexec/writeread-static -p n1 -n 64 -c 1048576 -b 4194304 -a $i -k -v -m /unifycr -f writeread-static_pn1_n64_c1MB_b4MB.app.$i 2> err.out
done
```
After consecutively writing and then reading 15 files with the same server running, the 16th run causes the server to crash and the clients to hang.
### Initial Thoughts
- Happening with IBM's MPI (not MVAPICH, though MVAPICH's thread limit might simply be higher; other MPI libraries haven't been tested)
- Could try increasing common_pami_max_threads
- Are we freeing threads correctly after reading?
- Consider using a thread pool instead of thread-per-process
- The problem might simply go away once we remove MPI