
btl: ofi add RMA/ATOMIC caps for one-sided to work

Open naughtont3 opened this issue 1 year ago • 6 comments

naughtont3 avatar Jan 30 '24 16:01 naughtont3

@naughtont3 I'm curious. We already request the caps here - isn't that enough?

wenduwan avatar Jan 30 '24 16:01 wenduwan

MCA_BTL_OFI_ONE_SIDED_REQUIRED_CAPS is only set in the hints.caps

The cxi provider checks tx_attr.caps for these capabilities explicitly, so if they are not set there, the operation fails.

One update to the patch is to use the define instead of explicitly specifying the caps
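To illustrate the distinction, here is a minimal, self-contained sketch of setting the one-sided caps in both places. It does not use libfabric: the `sketch_*` structs and `SKETCH_*` bits are hypothetical stand-ins for `struct fi_info`, its `tx_attr` member, and the `FI_RMA` / `FI_ATOMIC` flags, and the combined define is only assumed (from the title of this PR) to mirror what MCA_BTL_OFI_ONE_SIDED_REQUIRED_CAPS covers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in bits; real code would use FI_RMA and FI_ATOMIC. */
#define SKETCH_CAP_RMA    (UINT64_C(1) << 0)
#define SKETCH_CAP_ATOMIC (UINT64_C(1) << 1)

/* Assumed to play the role of MCA_BTL_OFI_ONE_SIDED_REQUIRED_CAPS. */
#define SKETCH_ONE_SIDED_REQUIRED_CAPS (SKETCH_CAP_RMA | SKETCH_CAP_ATOMIC)

/* Stand-in for the transmit attributes hanging off the hints. */
struct sketch_tx_attr {
    uint64_t caps;
};

/* Stand-in for the hints structure: top-level caps plus tx attributes. */
struct sketch_hints {
    uint64_t caps;
    struct sketch_tx_attr *tx_attr;
};

/* Request the one-sided caps at the top level AND in the tx attributes,
 * using the single define rather than listing the bits twice. */
static void sketch_request_one_sided(struct sketch_hints *hints)
{
    assert(hints != NULL && hints->tx_attr != NULL);
    hints->caps |= SKETCH_ONE_SIDED_REQUIRED_CAPS;
    hints->tx_attr->caps |= SKETCH_ONE_SIDED_REQUIRED_CAPS;
}
```

With only the first assignment, a provider that inspects the tx attributes would see no RMA/ATOMIC bits, which is the situation described above.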

amirshehataornl avatar Jan 30 '24 19:01 amirshehataornl

It could be that I'm running with older CXI code. In the version of the code I run, the transmit context is allocated with the caps passed in from the MPI fi_tx_context() call, and those are 0. Therefore the internal capabilities stored against the cxi transmit context are 0. When fi_read() is called, it checks that the RMA/ATOMIC capabilities are set, and because they are not, fi_read() fails.
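The failure mode described here can be modeled in a few lines. This is only a sketch of the behavior as reported in this comment, not the CXI provider's actual code: `demo_tx_context()` and `demo_read()` are hypothetical stand-ins for fi_tx_context() and fi_read(), the `DEMO_*` constants are invented for illustration, and checking only the RMA bit on a read is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in bits and error code, not libfabric values. */
#define DEMO_CAP_RMA           (UINT64_C(1) << 0)
#define DEMO_CAP_ATOMIC        (UINT64_C(1) << 1)
#define DEMO_ERR_NOT_SUPPORTED (-1)

/* Model of a transmit context that remembers the caps it was created with. */
struct demo_tx_ctx {
    uint64_t caps;
};

/* Model of context creation: the stored caps are exactly what the caller
 * passed in the tx attributes, with nothing inherited from elsewhere. */
static struct demo_tx_ctx demo_tx_context(uint64_t tx_attr_caps)
{
    struct demo_tx_ctx ctx = { tx_attr_caps };
    return ctx;
}

/* Model of a one-sided read: rejected unless the context has the RMA bit. */
static int demo_read(const struct demo_tx_ctx *ctx)
{
    assert(ctx != NULL);
    if ((ctx->caps & DEMO_CAP_RMA) == 0) {
        return DEMO_ERR_NOT_SUPPORTED;
    }
    return 0;
}
```

In this model, a context created with caps of 0 always fails `demo_read()`, while one created with the RMA and ATOMIC bits succeeds, which matches the behavior observed above.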

I'm looking at the code that they have in a PR in libfabric repo and it's very different than what I have.

How did you verify that the CXI provider works with one-sided? I.e., which code did you use?

amirshehataornl avatar Feb 01 '24 16:02 amirshehataornl

osu 5.8.0 one-sided unit tests on crusher

hppritcha avatar Feb 01 '24 17:02 hppritcha

between two nodes

hppritcha avatar Feb 01 '24 17:02 hppritcha

OK. I brought the issue that I saw to the attention of the CXI developers. We'll see what they say.

amirshehataornl avatar Feb 01 '24 18:02 amirshehataornl

Can we close this? This is trying to work around problems in releases of the CXI provider older than HPE SS11 2.2.x.

hppritcha avatar Feb 28 '24 22:02 hppritcha

If we are sure it's something which has already been fixed in cxi provider then I'm good with closing it. I haven't heard back from them about it.

amirshehataornl avatar Feb 28 '24 22:02 amirshehataornl

Closing due to inactivity. Please reopen if you still need it.

wenduwan avatar Mar 21 '24 21:03 wenduwan