zero-1 topic

List zero-1 repositories

pipegoose

74 stars · 17 forks

Large-scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still a work in progress)*