James Dinan
As Brian said, MPI applications follow the GDR rules above, so CUDA deals with the GPU's memory model gotchas for you. NCCL bends some of these rules -- notably, it...
@mwheinz Yes, I'm involved in this work. There's an issue in the libfabric stack that we haven't been able to identify. We aren't able to reproduce the cudaDeviceFlushGPUDirectRDMAWrites hang with...
The Open UCX team has started maintaining their own fork of XPMEM. Your issue might get more attention if you post it there: https://github.com/openucx/xpmem
@hjelmn As a user, it would be great to have a single source for XPMEM. I don't have an opinion on who should own the repository, but would prefer to...
This is a good change. Would you be willing to post a PR for it? Can you elaborate on the symmetric heap size limitation you ran into?
Got it -- happy to accept any additional patches needed to make this work.