distributed-deep-learning topic

Repositories tagged distributed-deep-learning

dnn-distributed

43 stars, 12 forks

Distributed training of DNNs • C++/MPI Proxies (GPT-2, GPT-3, CosmoFlow, DLRM)

Ok-Topk

20 stars, 8 forks

Ok-Topk is a scheme for distributed training with sparse gradients. It integrates a novel sparse allreduce algorithm (communication volume below 6k, where k is the number of selected gradient values, which is asymptotically optimal) with the d...