
WIP: Shared t5 code

Open thomasw21 opened this issue 3 years ago • 0 comments

requires: https://github.com/microsoft/DeepSpeed/pull/2035

TODO:

  • [x] Make sure we can run shared enc/dec with MLM
  • [x] Add a test making sure that it runs with MLM
  • [ ] Make sure we can load a BLOOM 6B model on a single A100 GPU
  • [ ] Make sure we can load a BLOOM model checkpoint (+ checkpoint manipulation script to share)

thomasw21 • Jun 21 '22 15:06