Adding support for 8-bit training with bitsandbytes
Adding the bitsandbytes dependency to requirements.txt.
Using the currently unused quantization option in the config file to toggle the bitsandbytes (BNB) 8-bit optimizer.
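A minimal sketch of how the flag might select the optimizer; the attribute names `training_conf.quantization` and `training_conf.learning_rate` are assumptions for illustration, not this repo's actual config keys:

```python
import bitsandbytes as bnb
import torch


def get_optimizer(model, training_conf):
    # `training_conf.quantization` is a hypothetical config attribute;
    # adjust to whatever key the config file actually exposes.
    if getattr(training_conf, "quantization", False):
        # bitsandbytes' 8-bit AdamW stores optimizer state in 8 bits,
        # substantially reducing its memory footprint.
        return bnb.optim.AdamW8bit(model.parameters(), lr=training_conf.learning_rate)
    return torch.optim.AdamW(model.parameters(), lr=training_conf.learning_rate)
```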
Details on using the 8-bit optimizer in HF are here, and a detailed discussion of the HF implementation is here.
Note the part about forcing the embedding layer to use the 32-bit optimizer. This is done as suggested in the link above:
> For existing pre-trained transformers models one could use them as is and use 8-bit optimizers for all weights, but 32-bit optimizers for the embedding layer.
This override code might require a bit more work if we choose to use a model that has a non-standard embedding layer.
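For reference, the override uses bitsandbytes' `GlobalOptimManager`, which is the same mechanism HF uses. The helper below is a sketch assuming the model's embedding layers are plain `nn.Embedding` modules:

```python
import bitsandbytes as bnb
import torch.nn as nn


def force_fp32_embedding_optim(model: nn.Module) -> None:
    # Keep 8-bit optimization for all other weights, but register a
    # 32-bit override for embedding weights, as the bitsandbytes docs
    # suggest. Overrides must be registered before the first optimizer step.
    manager = bnb.optim.GlobalOptimManager.get_instance()
    for module in model.modules():
        if isinstance(module, nn.Embedding):
            manager.register_module_override(module, "weight", {"optim_bits": 32})
```

A model with a non-standard embedding layer (one that does not subclass `nn.Embedding`) would slip past the `isinstance` check, which is the extra work mentioned above.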
Thanks a lot, looks great! Can you also run pre-commit for the final commit?
@sanagno Yes, fixed.