Megatron-DeepSpeed

a branch combining layer-norm-auto-sync and ds_ckpt_reshape

Open stas00 opened this issue 3 years ago • 0 comments

We have 2 branches that aren't ready for main yet, but we need both of them, so this branch is a merge of the 2. We will use it for the production run for now, so that the run is ready for checkpoint reshaping when we need it.

I merged https://github.com/bigscience-workshop/Megatron-DeepSpeed/pull/272/ into https://github.com/bigscience-workshop/Megatron-DeepSpeed/pull/239

stas00, Jun 29 '22 19:06