
Int 4 quantization?

Open fakerybakery opened this issue 2 years ago • 4 comments

Hello, I'm relatively new here, and I was wondering whether 4-bit quantization could be used to speed up inference. I know that tortoise-tts-fast supports half precision, but is it possible to quantize the models further? Thank you.

fakerybakery avatar Jul 18 '23 19:07 fakerybakery
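
For context on what the question is asking, here is a minimal sketch of symmetric 4-bit quantization in plain Python. It is not code from tortoise-tts or tortoise-tts-fast; the function names (`quantize_int4`, `pack_int4`, `unpack_int4`, `dequantize_int4`) are hypothetical and exist only for illustration. It assumes a single scale per list of weights, whereas practical schemes typically use one scale per small group of weights.

```python
from typing import List, Tuple


def quantize_int4(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric quantization: map floats to integers in [-8, 7] with one scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Map the largest magnitude to 7; fall back to 1.0 for all-zero input.
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    quantized = [max(-8, min(7, round(w / scale))) for w in weights]
    return quantized, scale


def pack_int4(quantized: List[int]) -> bytes:
    """Pack two 4-bit values per byte, low nibble first (pads odd lengths with 0)."""
    values = quantized + [0] if len(quantized) % 2 else quantized
    packed = bytearray()
    for low, high in zip(values[0::2], values[1::2]):
        packed.append((low & 0xF) | ((high & 0xF) << 4))
    return bytes(packed)


def unpack_int4(data: bytes, count: int) -> List[int]:
    """Recover the first `count` signed 4-bit values from packed bytes."""
    values = []
    for byte in data:
        for nibble in (byte & 0xF, byte >> 4):
            # Nibbles 8..15 encode the negative values -8..-1.
            values.append(nibble - 16 if nibble >= 8 else nibble)
    return values[:count]


def dequantize_int4(quantized: List[int], scale: float) -> List[float]:
    """Approximate the original floats from the integers and the scale."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.12, -0.5, 0.33, 0.9]
    q, scale = quantize_int4(weights)
    packed = pack_int4(q)
    restored = dequantize_int4(unpack_int4(packed, len(q)), scale)
    print(len(packed), "bytes for", len(weights), "weights")
    print(restored)
```

Packed this way, each weight takes half a byte instead of the two bytes used by half precision (ignoring the stored scales). Whether that translates into faster inference, rather than only a smaller memory footprint, depends on whether the runtime has kernels that operate on the packed format directly.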

I'm also interested in whether this works!

neonbjb avatar Jul 19 '23 04:07 neonbjb

I wonder if @ggerganov could make tortoise.cpp?

fakerybakery avatar Oct 02 '23 23:10 fakerybakery

Related to https://github.com/ggerganov/ggml/issues/59

thiswillbeyourgithub avatar Nov 19 '23 14:11 thiswillbeyourgithub

https://github.com/balisujohn/tortoise.cpp

tortoise.cpp implementation underway :) contributions are welcome!

balisujohn avatar Dec 25 '23 06:12 balisujohn