ch4/ofi: refactor MPIDI_OFI_request_t
Pull Request Description
Having multiple non-intercepting code paths to use the same structure fields in request makes code difficult to read. This PR separates the fields into distinct union members with more intuitive names in order to make the code more readable.
Before this PR the fields are defined as:
typedef struct {
struct fi_context context[MPIDI_OFI_CONTEXT_STRUCTS]; /* fixed field, do not move */
int event_id; /* fixed field, do not move */
MPL_atomic_int_t util_id;
MPI_Datatype datatype;
int nic_num; /* Store the nic number so we can use it to cancel a request later
* if needed. */
enum MPIDI_OFI_req_kind kind;
union {
struct fid_mr **send_mrs;
void *remote_info;
} huge;
union {
struct {
void *buf;
size_t count;
MPI_Datatype datatype;
char *pack_buffer;
} pack;
struct iovec *nopack;
} noncontig;
union {
struct iovec iov;
void *inject_buf; /* Internal buffer for inject emulation */
} util;
...
It is quite confusing on which paths are using which fields and which paths are intercepting. After this PR:
typedef struct {
struct fi_context context[MPIDI_OFI_CONTEXT_STRUCTS]; /* fixed field, do not move */
int event_id; /* fixed field, do not move */
MPL_atomic_int_t util_id;
MPI_Datatype datatype;
int nic_num; /* Store the nic number so we can use it to cancel a request later
* if needed. */
enum MPIDI_OFI_req_kind kind;
union {
/* send path */
struct {
void *pack_buffer;
} pack_send;
struct {
struct iovec *iovs;
} nopack_send;
struct {
struct fid_mr **mrs;
void *pack_buffer;
} huge_send;
struct {
int vci_local;
int ctx_idx;
fi_addr_t remote_addr;
uint64_t cq_data;
uint64_t match_bits;
int pipeline_tag;
int num_remain;
} pipeline_send;
struct {
void *inject_buf;
} am_inject_emu;
/* The recv path can be uncertain depend on the actual send path,
* thus some fields are significant and need be preset to NULL.
*/
struct {
void *remote_info; /* huge path if not NULL */
char *pack_buffer; /* need unpack if not NULL */
void *buf;
MPI_Aint count;
MPI_Datatype datatype;
struct iovec msg_iov; /* FI_CLAIM require fi_trecvmsg which require usage of iov.
* We always set it with {recv_buf, data_sz} since they are
* useful for the huge recv path as well.
*/
} recv;
struct {
struct iovec *iovs;
} nopack_recv;
struct {
int vci_local;
int ctx_idx;
fi_addr_t remote_addr;
uint64_t match_bits;
uint64_t mask_bits;
MPI_Aint offset;
int pipeline_tag;
int num_inrecv;
int num_remain;
bool is_sync;
void *buf;
MPI_Aint count;
MPI_Datatype datatype;
} pipeline_recv;
} u;
} MPIDI_OFI_request_t;
The union fields are not intercepting and clearly identifies the code paths. [skip warnings]
Author Checklist
- [x] Provide Description Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
- [x] Commits Follow Good Practice
Commits are self-contained and do not do two things at once.
Commit message is of the form:
module: short descriptionCommit message explains what's in the commit. - [x] Passes All Tests Whitespace checker. Warnings test. Additional tests via comments.
- [x] Contribution Agreement For non-Argonne authors, check contribution agreement. If necessary, request an explicit comment from your companies PR approval manager.
test:mpich/ch4/ofi
direct-nm had weird signal 7 during init. Retest to confirm -
test:mpich/ch4/ofi
This request object changes look good. I think I had a few comments on the previous PR that I need to revisit before this can go in.