Pieter Noordhuis issues

Results: 56 issues of Pieter Noordhuis

Reenable multi-process transport tests

These were disabled in #230 because they all fail when running consecutively. When run independently, they appear to pass...

CLA Signed

Take CUDA peer access into account for on-device reduction

The NVLink cube mesh architecture has partial peer access between devices. Two groups of 4 GPUs have full peer access and every GPU in one group has peer access to...

CLA Signed

Use a single listening socket per device

Stack from [ghstack](https://github.com/ezyang/ghstack): * **#243 Use a single listening socket per device** * #242 Add error class * #241 Add RAII wrapper for socket * #240 Allow deferring functions to...

CLA Signed

Allow deferring functions to the epoll(2) thread

Stack from [ghstack](https://github.com/ezyang/ghstack): * #243 Use a single listening socket per device * #242 Add error class * #241 Add RAII wrapper for socket * **#240 Allow deferring functions to...

CLA Signed

Add RAII wrapper for socket

Stack from [ghstack](https://github.com/ezyang/ghstack): * #243 Use a single listening socket per device * #242 Add error class * **#241 Add RAII wrapper for socket** * #240 Allow deferring functions to...

CLA Signed

Add error class

Stack from [ghstack](https://github.com/ezyang/ghstack): * #243 Use a single listening socket per device * **#242 Add error class** * #241 Add RAII wrapper for socket * #240 Allow deferring functions to...

CLA Signed

Don't busy-spin in tcp transport's event loop

Per @jjlilley in https://github.com/facebookincubator/gloo/pull/237#discussion_r356780531, we can use an `eventfd(2)` to avoid busy-spinning the epoll loop. If we do, we must also update the code that unregisters an fd to either:...
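The general idea of waking an epoll loop through an `eventfd(2)` can be sketched as follows. This is a Linux-only illustration under stated assumptions: the class `WakeableLoop` and its methods `wake` and `waitOnce` are hypothetical names, not Gloo's API, and the sketch does not address the fd unregistration changes the issue goes on to describe.

```cpp
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

#include <cstdint>
#include <stdexcept>

// Hypothetical helper, not Gloo's API: an epoll instance plus an eventfd
// that other threads write to when they need the loop to wake up.
class WakeableLoop {
 public:
  WakeableLoop() {
    epollFd_ = epoll_create1(EPOLL_CLOEXEC);
    if (epollFd_ < 0) {
      throw std::runtime_error("epoll_create1 failed");
    }
    wakeFd_ = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    if (wakeFd_ < 0) {
      close(epollFd_);
      throw std::runtime_error("eventfd failed");
    }
    epoll_event ev{};
    ev.events = EPOLLIN;  // level-triggered: readable while counter > 0
    ev.data.fd = wakeFd_;
    if (epoll_ctl(epollFd_, EPOLL_CTL_ADD, wakeFd_, &ev) < 0) {
      close(wakeFd_);
      close(epollFd_);
      throw std::runtime_error("epoll_ctl failed");
    }
  }

  ~WakeableLoop() {
    close(wakeFd_);
    close(epollFd_);
  }

  WakeableLoop(const WakeableLoop&) = delete;
  WakeableLoop& operator=(const WakeableLoop&) = delete;

  // Safe to call from any thread. Adds 1 to the eventfd counter, which
  // makes the eventfd readable and so unblocks epoll_wait.
  void wake() {
    const uint64_t one = 1;
    ssize_t rv = write(wakeFd_, &one, sizeof(one));
    (void)rv;  // on EAGAIN the counter is already nonzero, so a wakeup is pending
  }

  // Blocks for up to timeoutMs milliseconds (-1 blocks indefinitely).
  // Returns true only if the wakeup came from wake().
  bool waitOnce(int timeoutMs) {
    epoll_event ev{};
    int n = epoll_wait(epollFd_, &ev, 1, timeoutMs);
    if (n <= 0) {
      return false;  // timeout, EINTR, or error
    }
    if (ev.data.fd != wakeFd_) {
      return false;
    }
    // Reading resets the counter to zero, so several wake() calls made
    // before this point collapse into a single wakeup.
    uint64_t count = 0;
    ssize_t rv = read(wakeFd_, &count, sizeof(count));
    return rv == static_cast<ssize_t>(sizeof(count));
  }

 private:
  int epollFd_ = -1;
  int wakeFd_ = -1;
};
```

Because the loop thread sleeps in `epoll_wait` until either a registered fd or the eventfd becomes ready, it does not need a short timeout to notice work queued by other threads.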

Document send/recv tally

The `notify_send_ready` and `notify_recv_ready` messages used in the tcp backend (and future uv backend, see #195) need better documentation. The protocol for how these messages are sequenced needs documentation as well.

New context type that can dispatch to N actual contexts

This is what we do in PyTorch upstream today but it would be good to move the functionality into Gloo. This would be a new type of context that wraps...

enhancement

Allreduce using bcube algorithm caveats

The comments mention it is usable for any `#nodes == c * base ^ x`, for any `c >= 1`, `base >= 2`, and `x >= 1`, but in reality...
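To make the condition quoted from the comments concrete, the sketch below checks whether a node count can be written as `c * base ^ x` with `c >= 1` and `x >= 1` for a given `base >= 2`, and reports the largest such `x`. It tests only the condition as stated in the comments; the issue says the algorithm's real behavior differs, and that part is truncated here, so this is not a check of what the implementation supports. The names `BcubeShape` and `decompose` are hypothetical.

```cpp
#include <cstdint>

// Hypothetical illustration of the documented condition only.
struct BcubeShape {
  uint64_t c;  // remaining factor after dividing out powers of base
  uint64_t x;  // largest exponent such that nodes == c * base^x
};

// Returns true and fills `out` if nodes == c * base^x for some c >= 1 and
// x >= 1, given base >= 2. For a fixed base this reduces to "nodes is a
// nonzero multiple of base".
bool decompose(uint64_t nodes, uint64_t base, BcubeShape* out) {
  if (base < 2 || nodes == 0 || nodes % base != 0) {
    return false;
  }
  BcubeShape shape{nodes, 0};
  while (shape.c % base == 0) {
    shape.c /= base;
    ++shape.x;
  }
  *out = shape;
  return true;
}
```

For example, 12 nodes with base 2 decompose as `3 * 2 ^ 2`, while 6 nodes with base 4 do not satisfy the condition at all.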