
Add support for MPT

Open debackerl opened this issue 2 years ago • 2 comments

It would be interesting to add support for MPT models. They may be the only ones that use ALiBi encoding, and the new MPT-30B model supports an 8k context length.

Thanks!

debackerl avatar Jun 23 '23 20:06 debackerl

+1

louisoutin avatar Jul 04 '23 14:07 louisoutin

I looked into implementing this (branch). The missing pieces are:

  • ALiBi
  • Low precision LayerNorm
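For readers unfamiliar with ALiBi, here is a minimal pure-Python sketch of the idea: each attention head adds a linear penalty to its attention scores, proportional to the distance between query and key. The function names `alibi_slopes` and `alibi_bias` and the `max_bias=8.0` default are assumptions made for illustration, not code from the branch mentioned above, and the sketch only handles a power-of-two number of heads with a causal mask.

```python
def alibi_slopes(n_heads: int, max_bias: float = 8.0) -> list[float]:
    """Per-head slopes forming a geometric sequence, e.g. 1/2, 1/4, ..., 1/256 for 8 heads."""
    # Assumption: n_heads is a power of two; other head counts need extra handling.
    if n_heads <= 0 or n_heads & (n_heads - 1):
        raise ValueError("this sketch only handles a power-of-two number of heads")
    return [2.0 ** (-max_bias * (h + 1) / n_heads) for h in range(n_heads)]


def alibi_bias(n_heads: int, seq_len: int) -> list[list[list[float]]]:
    """Additive attention bias indexed as [head][query position][key position]."""
    bias = []
    for slope in alibi_slopes(n_heads):
        head = []
        for i in range(seq_len):  # query position
            row = []
            for j in range(seq_len):  # key position
                if j <= i:
                    # Linear penalty that grows with the query-key distance.
                    row.append(-slope * (i - j))
                else:
                    # Causal mask: a query cannot attend to future keys.
                    row.append(float("-inf"))
            head.append(row)
        bias.append(head)
    return bias
```

The bias would be added to the scaled query-key scores before the softmax, and no separate positional embedding is added to the token embeddings.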

And to reproduce training, they also use:

  • Embedding weights tied with lm_head
  • Kaiming normal initialization
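
As a rough illustration of these two training details, the pure-Python sketch below draws a weight matrix from a Kaiming normal distribution and reuses that same matrix as both the token embedding and the output projection. The names `kaiming_normal` and `TinyTiedLM`, the `sqrt(2)` gain, and the choice of fan-in are all assumptions made for this example; they are not taken from MPT or litgpt.

```python
import math
import random


def kaiming_normal(rows, cols, gain=math.sqrt(2.0), seed=0):
    """Draw a rows x cols matrix from N(0, gain**2 / fan_in), taking fan_in = cols."""
    rng = random.Random(seed)
    std = gain / math.sqrt(cols)
    return [[rng.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]


class TinyTiedLM:
    """Hypothetical toy model: the embedding table doubles as the lm_head weight."""

    def __init__(self, vocab_size, n_embd):
        self.wte = kaiming_normal(vocab_size, n_embd)
        # Tying: lm_head refers to the same matrix object, not a copy.
        self.lm_head = self.wte

    def embed(self, token_id):
        # Look up the embedding row for one token.
        return self.wte[token_id]

    def logits(self, hidden):
        # Project a hidden vector onto every vocabulary row (hidden @ wte^T).
        return [sum(h * w for h, w in zip(hidden, row)) for row in self.lm_head]
```

Because the two names refer to one matrix, an update to the embedding is also an update to the output projection, which is what tying means in practice.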

carmocca avatar Jul 04 '23 15:07 carmocca