Enhancement: detach dtype for prompt embeddings from the model itself

Open mayank31398 opened this issue 2 years ago • 1 comment

I think that right now the dtype of the prompt embeddings is tied to the dtype of the model, since the weights are copied. It would be nice to be able to use a different dtype for the prompt embeddings.

This would help mixed-precision training: the model itself doesn't need to be in fp32, only the prompt embeddings do, since they are the only parameters being trained.
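To make the idea concrete, here is a minimal plain-PyTorch sketch, not PEFT's actual code. `SoftPromptWrapper`, `num_virtual_tokens`, and `prompt_dtype` are hypothetical names used only for illustration, and bf16 stands in for the low-precision model dtype so the example also runs on CPU. The frozen token embeddings stay in low precision, while the trainable prompt embeddings are stored in fp32 and cast only when they are concatenated with the token embeddings.

```python
import torch
import torch.nn as nn


class SoftPromptWrapper(nn.Module):
    """Hypothetical wrapper: frozen low-precision token embeddings
    plus trainable prompt embeddings kept in their own dtype."""

    def __init__(self, word_embeddings: nn.Embedding, num_virtual_tokens: int,
                 prompt_dtype: torch.dtype = torch.float32):
        super().__init__()
        self.word_embeddings = word_embeddings
        # The base embeddings are not trained, so they can stay in low precision.
        for p in self.word_embeddings.parameters():
            p.requires_grad = False
        # Copy the initial weights, but cast the copy to prompt_dtype so the
        # trainable parameter is no longer tied to the model's dtype.
        init = word_embeddings.weight[:num_virtual_tokens].detach().clone()
        self.prompt_embeddings = nn.Parameter(init.to(prompt_dtype))

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        token_embeds = self.word_embeddings(input_ids)
        # Cast the prompts to the model's dtype only at the point of use;
        # autograd casts the gradient back to prompt_dtype.
        prompts = self.prompt_embeddings.to(token_embeds.dtype)
        prompts = prompts.unsqueeze(0).expand(input_ids.shape[0], -1, -1)
        return torch.cat([prompts, token_embeds], dim=1)


# Stand-in for a model's input embeddings, held in bf16.
base = nn.Embedding(100, 16).to(torch.bfloat16)
wrapper = SoftPromptWrapper(base, num_virtual_tokens=4)

out = wrapper(torch.randint(0, 100, (2, 5)))
out.float().sum().backward()  # dummy loss, just to produce gradients
```

With this layout, only `prompt_embeddings` receives a gradient, and both the parameter and its gradient stay in fp32, so an optimizer would update it in full precision while the rest of the model remains in low precision.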

Let me know your thoughts @pacman100

Feb 09 '23 06:02 mayank31398

Opened a PR for this

Feb 10 '23 17:02 mayank31398