gloo
Collective communications library with various primitives for multi-machine training.
Do not hardcode /include and /lib: these paths are system dependent, so users may want to change them easily during configuration. https://github.com/facebookincubator/gloo/issues/369
Use CMake variables instead of /lib and /include
OpenSSL 1.x reaches end-of-life in September, and recent distros like Ubuntu 22.04+ (last year) and Debian 12+ (next month) ship only OpenSSL 3. I have gloo (inside PyTorch) working with...
The Gloo example code has a problem: it leaves behind the temporary file generated under tmp by the previous execution. That file causes an error in the next execution. Therefore, I create...
Hi team, I successfully compiled gloo on ``MacOS`` by setting ``USE_LIBUV ON``, but when I test the ``reduce_scatter`` op, I get a core dump at runtime. I use pybind11 to...
This patch was necessary to enable cross-compilation for *-linux-musl. Cf. https://github.com/JuliaPackaging/Yggdrasil/pull/5352
Summary: ProcessGroupGloo and gloo seem to be opening and closing sockets without allowing the port to be reused. We see this issue pop up in larger training jobs "Address already...
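As a rough illustration of the port-reuse idea in the summary above, the following is a minimal POSIX sketch, not gloo's actual socket code. The function name `openReusableListener` and the choice of the loopback address are assumptions made for the example; the only real APIs used are the standard `socket`, `setsockopt`, `bind`, and `listen` calls.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cstdint>

// Hypothetical helper (not part of gloo): opens a TCP listener on the
// loopback interface with SO_REUSEADDR set, so that the port can be bound
// again while an earlier connection is still in TIME_WAIT.
// Returns the socket descriptor, or -1 on failure.
int openReusableListener(std::uint16_t port) {
  int fd = ::socket(AF_INET, SOCK_STREAM, 0);
  if (fd < 0) {
    return -1;
  }

  // The option must be set before bind() to have any effect.
  int on = 1;
  if (::setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) != 0) {
    ::close(fd);
    return -1;
  }

  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = htons(port);  // port 0 lets the kernel pick a free port

  if (::bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0 ||
      ::listen(fd, 16) != 0) {
    ::close(fd);
    return -1;
  }
  return fd;
}
```

A caller would use the returned descriptor with `accept` and release it with `close`. Whether this option alone resolves the behaviour described in the summary is not something the sketch can confirm.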
PR #346 reordered two elements in a structure that was actually meant to define "first structure + buffer at the end". Those fields cannot be reordered, but an alternative...
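One way to protect a "structure first, buffer at the end" layout against accidental reordering is to assert the member offsets at compile time. The sketch below is only an illustration of that idea: `Preamble`, `Frame`, their fields, and `fillFrame` are hypothetical names, not the structure touched by the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical fixed-size header; the fields are illustrative only.
struct Preamble {
  std::uint64_t nbytes;
  std::uint32_t opcode;
  std::uint32_t slot;
};

// Hypothetical frame: the header must come first and the buffer last,
// so that the whole object can be treated as "header + trailing bytes".
struct Frame {
  Preamble preamble;  // must stay first
  char buffer[64];    // must stay last
};

// Compile-time guards: reordering the members breaks the build
// instead of silently changing the memory layout.
static_assert(std::is_standard_layout<Frame>::value,
              "Frame must be standard layout for offsetof to be valid");
static_assert(offsetof(Frame, preamble) == 0,
              "preamble must be the first member");
static_assert(offsetof(Frame, buffer) == sizeof(Preamble),
              "buffer must directly follow the preamble");

// Copies a payload into the trailing buffer and records its size.
void fillFrame(Frame& frame, std::uint32_t opcode, const char* data,
               std::size_t nbytes) {
  assert(nbytes <= sizeof(frame.buffer));
  frame.preamble.nbytes = nbytes;
  frame.preamble.opcode = opcode;
  frame.preamble.slot = 0;
  std::memcpy(frame.buffer, data, nbytes);
}
```

With these assertions in place, swapping `preamble` and `buffer` fails at compile time rather than corrupting data at runtime.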