
bnb optimizers could use bnb.nn.StableEmbedding instead of torch.nn.Embedding

Open · mtasic85 opened this issue 1 year ago · 1 comment

According to the bnb documentation here:

https://huggingface.co/docs/bitsandbytes/main/optimizers
https://huggingface.co/docs/bitsandbytes/main/explanations/optimizers#stable-embedding-layer

This line could switch between bnb.nn.StableEmbedding and torch.nn.Embedding, or the choice could be made configurable in the config file: https://github.com/Lightning-AI/litgpt/blob/a8aa4bae5043b81b0b5e54bed838d1b57e1e1fe7/litgpt/model.py#L28

There are also other places in the code where torch.nn.Embedding is used.
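
For illustration, a minimal sketch of what a configurable choice could look like, assuming a hypothetical `use_stable_embedding` flag (not an existing litgpt config option):

```python
import torch.nn as nn

try:
    import bitsandbytes as bnb
except ImportError:
    bnb = None


def build_embedding(vocab_size: int, n_embd: int, use_stable_embedding: bool = False) -> nn.Module:
    """Return the token embedding layer, optionally using bnb's StableEmbedding."""
    # `use_stable_embedding` is a hypothetical config flag, not an existing litgpt option.
    if use_stable_embedding and bnb is not None:
        # StableEmbedding adds layer norm, Xavier init, and 32-bit optimizer state for the weights.
        return bnb.nn.StableEmbedding(vocab_size, n_embd)
    return nn.Embedding(vocab_size, n_embd)
```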

mtasic85 · Oct 03 '24

Thanks for the note and good point, I didn't know about this.

One challenge I see with configuring it in the config file is that the config is used during model creation, but one can later optionally run with --quantize bnb.nf4 or not. So, ideally, that swap should only take place when calling the inference/training functions, leaving the original model as is.
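
As a rough sketch of such a post-hoc swap: `model.transformer.wte` follows litgpt's GPT layout, but the helper itself is hypothetical, not something that exists in litgpt:

```python
import bitsandbytes as bnb
import torch.nn as nn


def swap_to_stable_embedding(model: nn.Module) -> nn.Module:
    """Replace the token embedding with bnb.nn.StableEmbedding, preserving weights."""
    old = model.transformer.wte  # litgpt's GPT keeps the token embedding here
    new = bnb.nn.StableEmbedding(old.num_embeddings, old.embedding_dim)
    new.weight.data.copy_(old.weight.data)
    # Note: StableEmbedding also applies a LayerNorm in forward, so outputs are not bit-identical.
    model.transformer.wte = new
    return model
```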

rasbt · Oct 03 '24

Upon reading a bit more, this would only be required for training (due to the optimizer choice). I added it in #1770
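
A hedged illustration of why this is a training-only concern: StableEmbedding mainly adds layer norm, Xavier init, and 32-bit optimizer state for the embedding weights, which only matters once a bnb 8-bit optimizer is in play (the class names below are real bitsandbytes APIs, the toy model is made up):

```python
import bitsandbytes as bnb
import torch.nn as nn

# Toy model: StableEmbedding in place of nn.Embedding for the token embeddings.
model = nn.Sequential(bnb.nn.StableEmbedding(1000, 64), nn.Linear(64, 1000))

# The 8-bit optimizer is where StableEmbedding pays off; inference is unaffected.
optimizer = bnb.optim.AdamW8bit(model.parameters(), lr=3e-4)
```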

rasbt · Oct 04 '24