William Zhang
Need to:
- Rework component selection logic
- Replace code that calls common_cuda where it can call the framework device-agnostically instead
- Do more testing
Looks like the PR build checker failed because Jenkins lost its connection; it should fix itself on a re-run.
I changed the datatype engine, common/ompio, osc/rdma, and mtls to use the new framework. The remaining dependencies on common_cuda.h are in:
- coll_cuda component
- nbc_internal
- pml ob1...
Added new asynchronous APIs, Seth added a null component, rewrote the asynchronous progress engine in the ob1 pml, and replaced all the OPAL_CUDA_SUPPORT ifdefs in the pml and btls (non-CUDA)....
> The failures look real; the datatype unit tests are failing with a segmentation fault.

Yeah, I think I messed something up in my pml ob1 code conversion, I'll fix...
> Would it be a win to have an accelerator.rst explaining the design, i.e., streams and events.
>
> Sorry, I meant to say: would an introduction in accelerator.h help?...
TODOs:
1. rm -rf opal/cuda -> move all usages of cuda in OMPI to Accelerator Framework - IN PROGRESS
2. ~~Figure out how to handle multiple Accelerators (of same type)...
> * What are the configure parameters to build OMPI with accelerator component?

--with-cuda=

> * What are the runtime parameters needed to toggle between accelerator instances?

Not sure if...
Is anyone familiar with the Mellanox CI? The failure looks real, but I'm not sure what exactly it is.
Is this the sort of behavior you're trying to implement?

```
size_t len = 1;
char tmp_addr[1];
fi_getname(fid, tmp_addr, &len);
char *addr = malloc(len);
fi_getname(fid, addr, &len);
```
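To make the question concrete, here is a minimal, self-contained sketch of the same query-then-allocate pattern with the return codes checked. `mock_getname` and `get_name_alloc` are hypothetical names, not libfabric functions: `mock_getname` only stands in for `fi_getname` so the snippet compiles without libfabric, and it assumes the convention that a too-small buffer makes the call fail while writing the required size into `*len`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for fi_getname(): copies a fixed address into buf.
 * If *len is too small, stores the required size in *len and returns -1. */
static int mock_getname(void *buf, size_t *len)
{
    static const char name[] = "example-endpoint-address";
    size_t need = sizeof(name);   /* includes the terminating NUL */

    if (*len < need) {
        *len = need;              /* tell the caller how much space is needed */
        return -1;
    }
    memcpy(buf, name, need);
    *len = need;
    return 0;
}

/* First call with a 1-byte buffer to learn the size, then allocate and fetch.
 * Returns a malloc'ed buffer the caller must free, or NULL on failure. */
static char *get_name_alloc(size_t *out_len)
{
    size_t len = 1;
    char tmp[1];
    char *addr;
    int rc;

    assert(out_len != NULL);

    rc = mock_getname(tmp, &len);     /* expected to fail and update len */
    addr = malloc(len);
    if (addr == NULL) {
        return NULL;
    }
    if (rc == 0) {
        memcpy(addr, tmp, len);       /* the name already fit in tmp */
    } else if (mock_getname(addr, &len) != 0) {
        free(addr);                   /* second call failed unexpectedly */
        return NULL;
    }
    *out_len = len;
    return addr;
}
```

A caller would do `size_t n; char *a = get_name_alloc(&n);` and `free(a)` when finished. Whether the real provider reports the required size on the first call is exactly the behavior being asked about, so treat that part as an assumption.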