byaldi
Added BitsAndBytes Support
Hello Byaldi Team!
Description
I added BitsAndBytes support for all of us GPU-poor people. This enables 4-bit/8-bit quantization, which lets the models run on smaller GPUs or, in my case, leaves room for a bigger LLM.
Changes Made
- Added BitsAndBytes quantization options to model loading
- Updated dependencies to include bitsandbytes
- Added quant_strategy in the example notebook
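To sketch how this could look from the caller's side: below is a minimal, hypothetical helper that maps a `quant_strategy` string to the keyword arguments `transformers.BitsAndBytesConfig` accepts. The helper name and the exact strategy strings are assumptions for illustration, not the actual API added in this PR.

```python
def make_quant_kwargs(quant_strategy):
    """Map a quant_strategy string to BitsAndBytesConfig-style kwargs.

    Hypothetical helper: the accepted strings ("4bit", "8bit", None)
    are assumptions; only the kwarg names mirror what transformers'
    BitsAndBytesConfig actually takes.
    """
    if quant_strategy == "4bit":
        # NF4 is the 4-bit quant type popularized by QLoRA.
        return {"load_in_4bit": True, "bnb_4bit_quant_type": "nf4"}
    if quant_strategy == "8bit":
        return {"load_in_8bit": True}
    if quant_strategy is None:
        # No quantization: load the model at full precision.
        return {}
    raise ValueError(f"unknown quant_strategy: {quant_strategy!r}")
```

The resulting dict would then be passed into `BitsAndBytesConfig(**kwargs)` before model loading, so callers only ever see the single `quant_strategy` string.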
Testing
- I am using Byaldi in a commercial setting and the 4-bit quantization didn't affect performance.