pml/ob1: introduce a new protocol for intermediate-sized messages

Open hjelmn opened this issue 2 weeks ago • 1 comments

This commit introduces the multi-eager protocol to ob1. The protocol fragments a message into multiple eager-sized messages and sends them in parallel to the destination. On the receiver, the first fragment is matched against a posted receive if one exists. If a receive is matched, each incoming multi-eager packet is copied directly into the user buffer without additional buffering in ob1, and the receive request is marked complete once all fragments have arrived. If the message is unexpected, it is buffered until all fragments have arrived and is then processed as a large eager message.
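The matched-receive path described above can be sketched roughly as follows. This is an illustrative model only, not the ob1 implementation: the `me_frag_t` and `me_recv_t` types, the `me_*` functions, and the idea that each fragment carries its own byte offset are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fragment descriptor (not an ob1 type). It is assumed here
 * that each fragment carries its offset so it can be placed independently. */
typedef struct {
    size_t offset;                 /* byte offset within the full message */
    size_t length;                 /* payload bytes in this fragment */
    const unsigned char *payload;  /* fragment data */
} me_frag_t;

/* Hypothetical state for a receive that has already been matched. */
typedef struct {
    unsigned char *user_buf;  /* destination buffer posted by the user */
    size_t total_len;         /* full message length */
    size_t bytes_received;    /* running total across all fragments */
    bool complete;            /* set once every byte has arrived */
} me_recv_t;

/* Number of fragments needed when each carries at most frag_size bytes. */
static size_t me_frag_count(size_t total_len, size_t frag_size)
{
    assert(frag_size > 0);
    return (total_len + frag_size - 1) / frag_size;
}

/* Sender side: describe fragment `index` of the message. */
static me_frag_t me_make_frag(const unsigned char *msg, size_t total_len,
                              size_t frag_size, size_t index)
{
    me_frag_t f;
    f.offset = index * frag_size;
    assert(f.offset < total_len);
    size_t remaining = total_len - f.offset;
    f.length = remaining < frag_size ? remaining : frag_size;
    f.payload = msg + f.offset;
    return f;
}

/* Receiver side: copy the fragment straight into the user buffer.
 * Fragments may arrive in any order because they are sent in parallel. */
static void me_recv_frag(me_recv_t *req, const me_frag_t *f)
{
    assert(f->offset + f->length <= req->total_len);
    memcpy(req->user_buf + f->offset, f->payload, f->length);
    req->bytes_received += f->length;
    if (req->bytes_received == req->total_len) {
        req->complete = true;
    }
}
```

Tracking completion by byte count rather than by fragment order is what lets this sketch tolerate out-of-order arrival. The unexpected-message path would buffer fragments the same way, just into temporary storage instead of the user buffer.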

This protocol is disabled by default. It is enabled by setting a BTL's multi_eager_limit larger than its eager_limit. When enabled, ob1 uses the new protocol for messages that are larger than the eager limit but smaller than the multi_eager_limit. Beyond that limit ob1 switches to a full rendezvous.
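The size-based selection described above could look something like the sketch below. The `btl_limits_t` struct, the `send_proto_t` enum, and `choose_proto` are hypothetical names for illustration and are not taken from the ob1 source. The handling of a message exactly equal to the eager limit is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical protocol tags (not ob1 identifiers). */
typedef enum {
    PROTO_EAGER,
    PROTO_MULTI_EAGER,
    PROTO_RENDEZVOUS
} send_proto_t;

/* Hypothetical view of the two per-BTL limits named in the description. */
typedef struct {
    size_t eager_limit;
    size_t multi_eager_limit;
} btl_limits_t;

/* Multi-eager is active only when its limit exceeds the eager limit. */
static bool multi_eager_enabled(const btl_limits_t *l)
{
    assert(l != NULL);
    return l->multi_eager_limit > l->eager_limit;
}

/* Pick a protocol for a message of `size` bytes.
 * Assumption: a message exactly at the eager limit is still sent eagerly. */
static send_proto_t choose_proto(const btl_limits_t *l, size_t size)
{
    if (size <= l->eager_limit) {
        return PROTO_EAGER;
    }
    if (multi_eager_enabled(l) && size < l->multi_eager_limit) {
        return PROTO_MULTI_EAGER;
    }
    return PROTO_RENDEZVOUS;
}
```

With both limits equal (the disabled case in this sketch), every message above the eager limit falls through to rendezvous, which matches the default behaviour described above.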

This protocol is inspired by the multiple send eager protocol used by OpenUCX. It can provide lower latency than the various rendezvous protocols because it does not wait for an ack from the receiver and uses neither RDMA read nor RDMA write. The trade-off is additional resources for in-flight messages. That cost is highest for non-contiguous data on the sender and for unexpected receives on the receiver.

This commit also re-organizes the match code so that multi-eager can make use of the same code.

hjelmn, Nov 21 '25 20:11