gpt-neox

Add Mixture of Experts

Open sdtblck opened this issue 3 years ago • 0 comments

From "DeepSpeed-MoE for NLG: Reducing the training cost of language models by 5 times".

It should be a fairly simple addition, as the codebase they open-sourced is largely similar to ours (same base model, although we have diverged a bit since).
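For reference, here is a rough sketch of the idea: a top-1 gated MoE layer, where a gate scores each token and only the best-scoring expert processes it. This is pure Python for illustration only. It is not the DeepSpeed or gpt-neox API, and the `Top1MoE` class, its methods, and the single-linear-map experts are all hypothetical simplifications (no load-balancing loss, no capacity limit, no expert parallelism).

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


class Top1MoE:
    """Hypothetical top-1 gated MoE layer; each expert is one linear map."""

    def __init__(self, dim, num_experts, seed=0):
        rng = random.Random(seed)

        def rand(rows, cols):
            return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.gate = rand(num_experts, dim)  # one gate row per expert
        self.experts = [rand(dim, dim) for _ in range(num_experts)]

    def route(self, token):
        """Return (expert index, gate probability) for a single token."""
        probs = softmax(matvec(self.gate, token))
        idx = max(range(len(probs)), key=probs.__getitem__)
        return idx, probs[idx]

    def forward(self, tokens):
        """Send each token to its top-1 expert and scale by the gate probability."""
        out = []
        for token in tokens:
            idx, p = self.route(token)
            y = matvec(self.experts[idx], token)
            out.append([p * v for v in y])
        return out
```

Only one expert runs per token, so compute per token stays roughly constant as the number of experts (and total parameters) grows.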

sdtblck, Dec 12 '21 21:12