Brendan Cunningham
> @BrendanCunningham could you check if this works with OPX provider? @tmh97 Can you take a look at this? @hppritcha I don't work on OPX myself; I maintain PSM2. Thomas...
It looks like the bug that my PR addressed was fixed in commit be3cd9abcb1103115ae6c3c92d8fc4ff5c912f77.
@schung-amd
> Hi @BrendanCunningham, while the driver is not yet public, is it possible to provide some code with your caching logic that reproduces the issue?

Yes. I misspoke before;...
> Thanks for the quick response!
>
> > Yes. I misspoke before; our AMD DMA code is not ready but it is public. [hfi1/pin_amd.c](https://github.com/cornelisnetworks/opa-distro-drivers/blob/rhel9.3/drivers/infiniband/hw/hfi1/pin_amd.c) has our send-side AMD DMA...
@schung-amd Also, is it safe to call `rdma_put_pages()` from an interrupt context? My workaround involves calling `rdma_put_pages()` in our hardware completion handler, which is called via interrupt.
> Hi @BrendanCunningham, thanks for following up on this! Sorry for the delay, I'm trying to collect more information from our internal teams before providing answers because I don't have...
> @BrendanCunningham Still gathering information re: calling rdma_put_pages() from an interrupt context; the internal team initially recommends against calling it, but is digging into the code to check. > >...
Here are `pr_debug()` printout logs from two hosts in a 2 rank, 2 host job with our driver (`hfi1`) with our AMD DMA cache enabled:

* [opx-node1.txt](https://github.com/user-attachments/files/17018785/opx-node1.txt)
* [opx-node2.txt](https://github.com/user-attachments/files/17018786/opx-node2.txt)

Note...
> Thanks for the logs! I'll pass them on to the internal team for more insight. As discussed, I wouldn't expect the callback to be called anywhere, as the internal...
@ddalessa no corrections; that is a good summary