Younes B

518 comments of Younes B

cc-ing @michaelbenayoun as well, in case you want to have a look ;)

Hi all! Just to summarise a bit what is happening and the solution we came up with to implement this! In the previous version, we found two major bugs:...

Thank you very much for your comments! `has_fp16_weights` comes from the class `bnb.Int8Params` that is currently being developed in a WIP branch that should be merged soon on the main...
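To make the role of a flag like `has_fp16_weights` concrete, here is a minimal, hypothetical sketch of absmax int8 quantization with an optional high-precision copy. All names here (`Int8ParamSketch`, `absmax_quantize`) are illustrative stand-ins, not the actual `bitsandbytes` API: the idea is simply that when `has_fp16_weights` is set, the full-precision weights are kept alongside the int8 codes (useful for mixed int8 training), and when it is off, only the int8 codes and scale are stored, which is the memory-saving inference path.

```python
# Hypothetical sketch, NOT the real bitsandbytes Int8Params class.

def absmax_quantize(weights):
    """Quantize a list of floats to int8 codes with absmax scaling."""
    scale = max(abs(w) for w in weights) / 127.0
    codes = [round(w / scale) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float weights from int8 codes."""
    return [c * scale for c in codes]

class Int8ParamSketch:
    def __init__(self, data, has_fp16_weights=False):
        # When has_fp16_weights is True, keep the high-precision copy
        # (mixed int8 training needs it); otherwise store only the
        # int8 codes plus one scale per tensor.
        self.has_fp16_weights = has_fp16_weights
        self.fp_weights = list(data) if has_fp16_weights else None
        self.codes, self.scale = absmax_quantize(data)

w = [0.5, -1.27, 0.03, 1.0]
p = Int8ParamSketch(w, has_fp16_weights=False)
print(p.codes)                        # int8 codes in [-127, 127]
print(dequantize(p.codes, p.scale))   # approximate reconstruction of w
```

The reconstruction error per weight is bounded by half the scale, which is why absmax quantization degrades gracefully as long as no single outlier blows up the scale.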

I think before merging we need:
- [x] Memory footprint benchmarking
- [x] Inference speed benchmarking
- [x] `lm-eval` benchmarking for large models (it has been done for small models)...

Added another PR to support int8 quantization + `accelerate` on multi-GPU setup here: https://github.com/huggingface/accelerate/pull/539 !
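To illustrate the kind of work the multi-GPU dispatch has to do, here is a naive, self-contained sketch of assigning layers to devices by memory budget. This is only a conceptual toy: `make_device_map`, the layer names, and the capacities are all made up for illustration, and the real `accelerate` logic is considerably more careful (tied weights, buffers, CPU/disk offload).

```python
# Toy sketch of layer-to-device assignment, NOT the accelerate implementation.

def make_device_map(layer_sizes, device_capacities):
    """Greedily assign layers (name -> size) to devices (name -> capacity)."""
    device_map = {}
    remaining = dict(device_capacities)
    devices = list(device_capacities)
    idx = 0
    for name, size in layer_sizes.items():
        # Move on to the next device once the current one is full.
        while idx < len(devices) and remaining[devices[idx]] < size:
            idx += 1
        if idx == len(devices):
            raise RuntimeError(f"not enough device memory for {name}")
        device_map[name] = devices[idx]
        remaining[devices[idx]] -= size
    return device_map

layers = {"embed": 4, "block.0": 6, "block.1": 6, "lm_head": 4}
caps = {"cuda:0": 10, "cuda:1": 12}
print(make_device_map(layers, caps))
# fills cuda:0 first, then spills the remaining layers onto cuda:1
```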

Thanks @sgugger for your review! Fixed the suggestions ;) I think we are good to merge https://github.com/huggingface/accelerate/pull/539 if you don't mind 🙏 I just need to...

TODOs:
- [x] Have a working colab demo for inference
- [x] Add more documentation
- [x] Implement tests

Before moving forward, I would like to have comments from @michaelbenayoun @mfuntowicz and @echarlaix.

## About this PR

We replace all the `nn.Linear` modules with the `bnb.Linear8bitLt` modules from...
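The replacement strategy described above can be sketched as a recursive walk over the module tree. The classes below (`TinyModule`, `Linear`, `Linear8bitStandIn`) are mocks so the sketch stays self-contained; the PR itself operates on torch's `nn.Linear` and swaps in `bnb.Linear8bitLt`.

```python
# Mock module tree standing in for torch modules, for illustration only.
class TinyModule:
    def __init__(self, **children):
        self.children = dict(children)

    def named_children(self):
        return self.children.items()

class Linear:
    """Stand-in for nn.Linear."""

class Linear8bitStandIn:
    """Stand-in for bnb.Linear8bitLt; wraps the module it replaces."""
    def __init__(self, replaced):
        self.replaced = replaced

def replace_linears(module):
    """Recursively swap every Linear child for its 8-bit stand-in."""
    for name, child in module.named_children():
        if isinstance(child, Linear):
            module.children[name] = Linear8bitStandIn(child)
        elif isinstance(child, TinyModule):
            replace_linears(child)  # recurse into submodules
    return module

model = TinyModule(attn=TinyModule(q=Linear(), k=Linear()), head=Linear())
replace_linears(model)
print(type(model.children["head"]).__name__)  # the head was swapped
```

The same pattern works on real torch models by iterating `named_children()` and using `setattr` to install the replacement, which is why the swap can be applied to any architecture without touching its definition.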

I can confirm that the slow tests I designed pass on my testing machine (2x Tesla T4, 15GB). However, it is not yet possible to load saved int8...

Hi @cnbeining! Thanks for your interest in this feature, and happy to see that you are already excited to run it on Codegen! 🚀 To begin with, your problem is related...