
OpenMP-parallel MPI send

Open conradtchan opened this issue 2 years ago • 0 comments

Work in progress

Type of PR: modification to existing code

Description:

Profiling of MPI runs has shown that the OMP critical sections used for sending cells over MPI are a bottleneck. Threads spend a significant amount of time waiting for other threads to finish sending.

This PR implements an individual send/receive buffer for each OMP thread in the form of a threadprivate variable, allowing it to call the MPI send/receive independently. The receive stacks are still shared, but OMP atomic operations are used to write to them instead of critical sections, which significantly improves performance.
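For illustration only, the pattern described above can be sketched in C with OpenMP pragmas. This is not the code in this PR: `cell_t`, `send_buffer`, `recv_stack`, `push_cell`, and `fill_stack` are hypothetical names, the MPI calls are left out entirely, and the loop just stands in for threads that each fill their own buffer and then append to the shared stack. The atomic capture reserves a unique slot index, so the actual write needs no critical section.

```c
#include <assert.h>

#define STACK_CAPACITY 1024

/* Hypothetical cell payload; the real cell type is not shown in this PR. */
typedef struct {
    int owner;
    double value;
} cell_t;

/* One send buffer per thread: threadprivate gives every OMP thread its own copy. */
static cell_t send_buffer;
#pragma omp threadprivate(send_buffer)

/* Shared receive stack and its fill counter. */
static cell_t recv_stack[STACK_CAPACITY];
static int recv_count = 0;

/* Reserve a slot atomically, then write the cell without holding a lock. */
static int push_cell(const cell_t *cell)
{
    int slot;
    #pragma omp atomic capture
    slot = recv_count++;
    assert(slot < STACK_CAPACITY);
    recv_stack[slot] = *cell;
    return slot;
}

/* Each thread fills its private buffer and pushes it onto the shared stack. */
int fill_stack(int n)
{
    recv_count = 0;
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        send_buffer.owner = i;
        send_buffer.value = 2.0 * i;
        push_cell(&send_buffer);
    }
    return recv_count;
}

/* Sum of the owner fields currently on the stack, used as a simple checksum. */
long owner_checksum(void)
{
    long sum = 0;
    for (int i = 0; i < recv_count; i++) {
        sum += recv_stack[i].owner;
    }
    return sum;
}
```

With OpenMP enabled (for example `-fopenmp` on GCC) the slots are filled in a nondeterministic order, but each cell lands in exactly one slot. Without OpenMP the pragmas are ignored and the same code runs serially.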

A consequence is that multiple cells may be waiting to be received from a given MPI task, so the receive method is modified to loop over all pending receives, rather than receiving just one cell. The receive method is contained in an OMP single section to prevent threads from attempting to receive back-to-back, since the first thread will already have processed all of the pending receives.
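A minimal sketch of this receive pattern, again in C with OpenMP pragmas and without real MPI calls. Everything here is a hypothetical stand-in: `pending` simulates cells waiting to be received, and `try_receive` plays the role that a non-blocking probe or test would play in an actual MPI code. The point is only that a single thread drains the whole queue in one call.

```c
#include <assert.h>

#define MAX_PENDING 64

/* Stand-in for cells waiting to be received from remote MPI tasks. */
static int pending[MAX_PENDING];
static int n_pending = 0;

/* Counters used to observe the behaviour of the sketch. */
static int n_processed = 0;
static int n_drain_calls = 0;

/* Hypothetical non-blocking receive: returns 1 and pops a message if one is waiting. */
static int try_receive(int *msg)
{
    if (n_pending == 0) {
        return 0;
    }
    *msg = pending[--n_pending];
    return 1;
}

/* Loop over all waiting receives instead of handling just one. */
static void receive_all(void)
{
    int msg;
    n_drain_calls++;
    while (try_receive(&msg)) {
        n_processed++;
    }
}

/* Queue n_messages, then let exactly one thread of the team drain them. */
int drain_in_parallel_region(int n_messages)
{
    assert(n_messages <= MAX_PENDING);
    for (int i = 0; i < n_messages; i++) {
        pending[i] = i;
    }
    n_pending = n_messages;
    n_processed = 0;
    n_drain_calls = 0;

    #pragma omp parallel
    {
        /* Only one thread executes the drain; the others wait at the implicit barrier. */
        #pragma omp single
        receive_all();
    }
    return n_processed;
}
```

Because of the `single` construct, `receive_all` runs once per parallel region regardless of the number of threads, which avoids a second thread immediately polling an already empty queue.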

The mcmodel=medium flag needs to be removed because it causes a section type conflict with the GCC compiler. This flag has not been necessary since dynamic memory allocation was implemented.

Testing: Describe how you have tested the change

Did you run the bots? yes/no

conradtchan, Aug 01 '22 02:08