gloo
Collective communications library with various primitives for multi-machine training.
Fix compile error for ROCm: 1. add python to the command line; 2. print the error code when the "Failed to get hipify files list!" error occurs.
1. `gatherv.h` is not installed by `make install`. 2. In gloo/gatherv.cc, the `GLOO_ENFORCE_GE` check is wrong. For example: there are two tasks, task A sends 3 and...
#### Use-case I have a client and a server, each running on a different thread on the same GPU rank. I need to send data from the client thread to the server thread...
https://github.com/facebookincubator/gloo/blob/e6d509b527712a143996f2f59a10480efa804f8b/gloo/common/linux.cc#L201 I think it can be substituted with `void *`.
Hi, I wrote a communication framework for our company's self-developed GPGPU, using Gloo's IB interface. When using `torch.utils.data.DataLoader`, which forks many processes, I got the following error: `gloo/transport/ibverbs/pair.cc:438] wc->status...`
These patches were necessary to cross-compile for mingw32. Cf. * https://github.com/JuliaPackaging/Yggdrasil/pull/4570 * https://github.com/JuliaPackaging/Yggdrasil/pull/4571 * https://github.com/JuliaPackaging/Yggdrasil/pull/5352
Hi, when I use Gloo AlltoAll in my work, the performance is much slower than expected. Here is a test in my environment. rank_num : 2 element_per_rank...
Performing an alltoallv message exchange on CPUs results in the following error: terminate called after throwing an instance of 'gloo::EnforceNotMet' what(): [enforce fail at ../third_party/gloo/gloo/transport/tcp/pair.cc:490] op.preamble.length
Summary: X-link: https://github.com/facebook/CacheLib/pull/137 Fix error: anonymous non-C-compatible type given name for linkage purposes by alias declaration; add a tag name here [-Werror,-Wnon-c-typedef-for-linkage] Reviewed By: philippv Differential Revision: D36043476
Changes to add hipify_torch (hipification for AMD builds) as a submodule. Build: passes. build/gloo/test/gloo_test: passes. This is a companion PR to https://github.com/pytorch/pytorch/pull/74704, which adds the hipify_torch submodule to PyTorch.