
Collective communications library with various primitives for multi-machine training.

Results: 90 gloo issues (sorted by recently updated)

- Changes to control hipify of CUDA_VERSION to HIP_VERSION.
- Use GLOO_USE_ROCM instead of __HIP_PLATFORM_HCC__.
- Add __HIP_PLATFORM_AMD__, since __HIP_PLATFORM_HCC__ is being deprecated.

CLA Signed

Summary: The MultiProc tests do not catch multiprocessing errors thoroughly. This diff plugs some of the holes and adds better logging on failures. Differential Revision: D26186660

CLA Signed
fb-exported

When trying to build the lib on Ubuntu with CMake using clang++-11 with libc++, the following error occurs: /home/lib/pytorch/third_party/gloo/gloo/transport/tcp/device.cc:152:39: error: implicit instantiation of undefined template 'std::__1::array' std::array hostname; ^ /usr/lib/llvm-10/bin/../include/c++/v1/__tuple:219:64:...
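A minimal sketch of the failure and the likely fix, assuming device.cc uses std::array for the hostname buffer without including <array> (libc++ only forward-declares std::array in internal headers such as the <__tuple> header seen in the error output, whereas libstdc++ often pulls in the full definition transitively). The helper name localHostname is hypothetical:

```cpp
#include <array>     // the missing include: without it, libc++ has only a
                     // forward declaration and instantiation fails
#include <string>
#include <unistd.h>  // gethostname

std::string localHostname() {
  std::array<char, 256> hostname{};  // value-initialized, so data() is NUL-terminated
  if (gethostname(hostname.data(), hostname.size()) != 0) {
    return "";
  }
  return std::string(hostname.data());
}
```

The general rule: include the header for every standard type you name directly, rather than relying on transitive includes that differ between standard library implementations.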

CLA Signed

This clears the warning: CMake Warning: The package name passed to `find_package_handle_standard_args` (RCCL) does not match the name of the calling package (rccl). This can lead to problems in calling...
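A minimal sketch of the kind of fix that clears this warning, assuming it originates in a Findrccl.cmake module whose first argument to find_package_handle_standard_args was spelled RCCL; the variable names rccl_INCLUDE_DIR and rccl_LIBRARY here are hypothetical:

```cmake
# Hypothetical Findrccl.cmake excerpt. The first argument must match the
# calling package's name ("rccl", lowercase) to avoid the CMake warning.
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(rccl
  REQUIRED_VARS rccl_INCLUDE_DIR rccl_LIBRARY)
```

CMake infers the "calling package" name from the find-module's filename, so the name passed to find_package_handle_standard_args must match it case-sensitively.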

Summary: Add alltoall and alltoallv to Gloo Differential Revision: D21873282
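As background, alltoall's data movement can be sketched on a single process (illustrative C++ only, not Gloo's actual API; the function and buffer layout here are hypothetical): each rank's send buffer is split into one chunk per rank, rank src's chunk dst goes to rank dst, and rank dst's receive buffer ends up holding chunk src from every rank. alltoallv is the same exchange with per-pair chunk sizes instead of one uniform size.

```cpp
#include <cstddef>
#include <vector>

// in[r] is rank r's send buffer, split into `ranks` equal chunks.
// Returns out, where out[r] is rank r's receive buffer.
std::vector<std::vector<int>> alltoall(const std::vector<std::vector<int>>& in) {
  const std::size_t ranks = in.size();
  const std::size_t chunk = in[0].size() / ranks;
  std::vector<std::vector<int>> out(ranks, std::vector<int>(ranks * chunk));
  for (std::size_t src = 0; src < ranks; ++src) {
    for (std::size_t dst = 0; dst < ranks; ++dst) {
      for (std::size_t i = 0; i < chunk; ++i) {
        // rank src's chunk for dst lands in slot src of dst's buffer
        out[dst][src * chunk + i] = in[src][dst * chunk + i];
      }
    }
  }
  return out;
}
```

For example, with two ranks sending {1, 2} and {3, 4}, rank 0 receives {1, 3} and rank 1 receives {2, 4}.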

CLA Signed
fb-exported

This avoids a collision with a variable in the RCCL CMake file. It should fix the error about not finding "-lrccl" in https://github.com/pytorch/pytorch/pull/31341 (now refiled as https://github.com/pytorch/pytorch/pull/34683).

CLA Signed

Let's see what happens...

CLA Signed

These tests were disabled in #230 because they all fail when run consecutively; when run independently, they appear to pass...

CLA Signed

The NVLink cube-mesh architecture provides only partial peer access between devices: each of two groups of 4 GPUs has full peer access within the group, and every GPU in one group has peer access to...
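The partial-connectivity pattern can be modeled with a small predicate. This is a hypothetical model, not Gloo code: it assumes 8 GPUs and that each GPU's single cross-group link goes to the GPU with the same index in the other group, which the truncated description above does not confirm.

```cpp
// Model of a cube-mesh peer-access matrix: GPUs 0-3 form group 0,
// GPUs 4-7 form group 1. Full peer access within a group; across
// groups, only mirror-index pairs (0-4, 1-5, 2-6, 3-7) are linked.
bool canAccessPeer(int dev, int peer) {
  if (dev == peer) {
    return false;  // a device is not its own peer
  }
  const bool sameGroup = (dev / 4) == (peer / 4);
  const bool mirrored = (dev % 4) == (peer % 4);
  return sameGroup || mirrored;
}
```

On real hardware the equivalent query is cudaDeviceCanAccessPeer; a collective over such a topology must route cross-group traffic through the linked pairs rather than assuming all-to-all peer connectivity.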

CLA Signed

For Gloo in PyTorch distributed, as described in this document https://pytorch.org/docs/stable/distributed.html, will the following code get the performance benefits of CUDA-aware MPI (e.g., GPU-to-GPU transfers over PCIe that bypass the CPU)?...