bosilca
The accelerator collective module (which allocates and moves the data onto the host in order to complete collective communications) has a priority higher than some collective modules that do natively...
I'm puzzled by this code working. It does all the right things, but it was calling PML->add_procs with zero new procs, ALWAYS (the `ilist` was empty by the time we...
Don't build the smcuda BTL if there are no accelerators available. Right now I have added support for CUDA; extend it for other accelerators you care about (@edgargabriel and @hppritcha).
Add support for sending and receiving data directly from and to devices. There are a few caveats (noted in the commit log). Note: because it includes the `span` renaming, this...
The idea is the following: - task incarnations (aka BODY) can be marked with the "batch" property, allowing the runtime to provide the task with the entire list of ready...
Introduce the parsec_gpu_flow_info_s info structure to combine the flow information needed by the GPU code. Allow the standard device tasks (aka. parsec_gpu_dsl_task_t) to contain the flow_info array inside the task,...
This is part of a multi-project effort; a similar PR will be created in OpenPMIX and OMPI. The goal of each of these changes is the same: instead of using...
@mentOS31 correctly identified a few issues with the handling of persistent requests with regard to their error codes (more info in the issue). This PR fixes all wait and test function...