Florian Zimmermeister

Results: 69 comments by Florian Zimmermeister

![image](https://user-images.githubusercontent.com/47894090/236434934-f5a36527-6881-4045-9277-169828dc69ec.png) Am I missing something to get it working?
```
{
  "model_config": {
    "model_id": "OpenAssistant/oasst-sft-7-llama-30b",
    "max_input_length": 1024,
    "max_total_length": 1792,
    "quantized": false
  },
  "sampling_parameters": {
    "top_k": 50,
    "top_p": null,
    "typical_p":...
```

`desc_act` needs to be set to true; then it works as expected.
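
As a rough illustration only: assuming the comment refers to the GPTQ `desc_act` (activation-order) option as it is usually stored in a quantization config, a minimal sketch of such a config and a sanity check could look like this. The `bits` and `group_size` values and the `check_desc_act` helper are hypothetical and not taken from the thread.

```python
import json

# Hypothetical GPTQ-style quantization settings.
# Only `desc_act` comes from the comment above; the other
# values are placeholders chosen for illustration.
quantize_config = {
    "bits": 4,
    "group_size": 128,
    "desc_act": True,
}


def check_desc_act(config: dict) -> None:
    """Raise ValueError unless `desc_act` is explicitly enabled."""
    if config.get("desc_act") is not True:
        raise ValueError("desc_act must be true for this setup")


# Passes silently for the config above, then prints it as JSON.
check_desc_act(quantize_config)
print(json.dumps(quantize_config, indent=2))
```

A check like this fails early, before a model is loaded with a mismatched setting.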

https://github.com/Vahe1994/SpQR/issues/1

https://twitter.com/Tim_Dettmers/status/1676352492190433280?s=20 4-bit will be 6.8x faster than before, so maybe we can discuss again whether it is worth replacing 8-bit with 4-bit after the release of the...

https://twitter.com/Tim_Dettmers/status/1677826353457143808 Release is tomorrow, and it is faster than 16-bit. @Narsil, are you open to discussing this again?

The layers file contains linear8bit, but I can't find where it is used. Using the search, I only find the `load_in_8bit` param of `from_pretrained`...

Hi Olivier, just out of interest: did they give you any information about the timing? It seems like the dedup PR of SpQR will be merged soon and...

https://github.com/Vahe1994/SpQR/pull/32 Just found that model saving is WIP

![image](https://github.com/huggingface/text-generation-inference/assets/47894090/1e52521c-1bfc-429d-aed8-348ee32647fb) The purple one is trained with the 3-line fix given in the Colab.

@iantbutler01 let me know if you need support at any point. At the moment I am focused on training such models rather than integrating them into TGI.