gloo

Collective communications library with various primitives for multi-machine training.

Results 90 gloo issues

I noticed that the constructor does not take a reduction function, and there's no way to set it. https://github.com/facebookincubator/gloo/blob/master/gloo/cuda_allreduce_halving_doubling.h#L75 Is this intended?

This would verify that we copy all the headers we need to copy in the install step.

Otherwise REALLY weird errors pop up (see for example https://github.com/pytorch/pytorch/issues/2835)

Hi! Firstly, thanks for the nice work. It's good to see the brief benchmark figures in README.md. It would be great if anybody can show the benchmarking result of `--transport...

Do this instead of assuming hostname(2) is resolvable. This is typically not the case on people's custom Ubuntu installs and whatnot. We keep the API the same but just change...

enhancement

Two computers: one runs Ubuntu and the other Windows 11. PyTorch 2.3.0 is used for distributed model training. Since Windows 11 does not support the NCCL backend, gloo is used, but the...

Hi, can Gloo use libfabric? I see it can use ibverbs as a transport. Why not libfabric, to allow all types of transport?

Hi, all. I am the maintainer of vcpkg. Recently we received a build error regarding gloo. https://github.com/microsoft/vcpkg/issues/38852

**Reproduce**:
```
git clone https://github.com/microsoft/vcpkg
cd vcpkg/
./vcpkg install gloo:x64-linux
```

**Error**: ```...

CLA Signed

`<cstdint>` needs to be included explicitly for `uint8_t` when compiling with GCC 15.
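A minimal sketch of the kind of change this describes, assuming the failure comes from a header that used `uint8_t` while relying on another standard header to pull in `<cstdint>` transitively. The helper `byteChecksum` is hypothetical and not part of gloo; it only shows the explicit include in context.

```cpp
// Include <cstdint> directly instead of relying on <vector>, <string>, etc.
// to provide the fixed-width integer types as a side effect.
#include <cstdint>
#include <vector>

// Hypothetical helper (not a gloo API): sums the bytes modulo 256.
inline std::uint8_t byteChecksum(const std::vector<std::uint8_t>& data) {
  std::uint8_t sum = 0;
  for (std::uint8_t b : data) {
    // Cast back explicitly because the addition is performed as int.
    sum = static_cast<std::uint8_t>(sum + b);
  }
  return sum;
}
```

With the explicit include, the header no longer depends on which other standard headers happen to be included before it, so it should compile the same way across standard library versions.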

CLA Signed