Adding support for 8-bit training with bitsandbytes
Adding the bitsandbytes dependency to requirements.txt.
Using the currently unused quantization option in the config file to toggle the bitsandbytes (BNB) 8-bit optimizer.
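A minimal sketch of how the flag might select the optimizer; the attribute names `training_conf.quantization` and `training_conf.learning_rate` are assumptions for illustration, not this repo's actual config keys:

```python
import bitsandbytes as bnb
import torch


def get_optimizer(model, training_conf):
    # `training_conf.quantization` is a hypothetical config attribute;
    # adjust to whatever key the config file actually exposes.
    if getattr(training_conf, "quantization", False):
        # bitsandbytes' 8-bit AdamW stores optimizer state in 8 bits,
        # substantially reducing its memory footprint.
        return bnb.optim.AdamW8bit(model.parameters(), lr=training_conf.learning_rate)
    return torch.optim.AdamW(model.parameters(), lr=training_conf.learning_rate)
```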
Details on using the 8-bit optimizer in HF are here, and a detailed discussion of the HF implementation is here.
Note the part about forcing the embedding layer to use the 32-bit optimizer. This is done as suggested in the link above:
> For existing pre-trained transformers models one could use them as is and use 8-bit optimizers for all weights, but 32-bit optimizers for the embedding layer.
This override code might require a bit more work if we choose to use a model that has a non-standard embedding layer.
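For reference, the override uses bitsandbytes' `GlobalOptimManager`, which is the same mechanism HF uses. The helper below is a sketch assuming the model's embedding layers are plain `nn.Embedding` modules:

```python
import bitsandbytes as bnb
import torch.nn as nn


def force_fp32_embedding_optim(model: nn.Module) -> None:
    # Keep 8-bit optimization for all other weights, but register a
    # 32-bit override for embedding weights, as the bitsandbytes docs
    # suggest. Overrides must be registered before the first optimizer step.
    manager = bnb.optim.GlobalOptimManager.get_instance()
    for module in model.modules():
        if isinstance(module, nn.Embedding):
            manager.register_module_override(module, "weight", {"optim_bits": 32})
```

A model with a non-standard embedding layer (one that does not subclass `nn.Embedding`) would slip past the `isinstance` check, which is the extra work mentioned above.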
Thanks a lot, looks great! Can you also run pre-commit for the final commit?
@sanagno Yes, fixed.