Technotech

Results: 5 comments by Technotech

> Going to 3 bits alone is a big drop in quality

If a 13B model at 3-bit could fit into VRAM, it'd probably still be better than a 7B...

> @TechnotechYT 13B at 3-bit will probably perform better than 7B at 4-bit, yes. I think it would still be a tight squeeze in 8 GB, especially if you have...
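For context, a rough back-of-the-envelope estimate of the weight memory involved (the effective bits-per-weight figures below are my own assumptions, loosely in line with llama.cpp k-quants, and the sketch ignores the KV cache and activations, so real usage is higher):

```python
# Rough weight-memory estimate: params * bits_per_weight / 8 bytes.
# Effective bits-per-weight includes quantization scales (assumed values).

def weight_gib(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GiB for a quantized model."""
    bytes_total = n_params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 2**30

for name, params, bits in [("7B @ ~4-bit", 7, 4.5), ("13B @ ~3-bit", 13, 3.5)]:
    print(f"{name}: ~{weight_gib(params, bits):.1f} GiB")
# 7B @ ~4-bit:  ~3.7 GiB
# 13B @ ~3-bit: ~5.3 GiB
```

By this estimate a 3-bit 13B leaves only a couple of GiB of an 8 GB card for the KV cache and everything else, which matches the "tight squeeze" above.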

While I'm not an expert by any means, VITS in CoquiTTS is close to real-time on CPU (I tested on a mid-range laptop CPU). With ggml and a good quant...
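A minimal sketch of how that CPU timing can be checked, assuming the Coqui TTS Python API (`TTS.api.TTS`) and the LJSpeech VITS model; the exact model name and constructor arguments may differ between Coqui TTS versions:

```python
# Time VITS synthesis on CPU and report the real-time factor (RTF).
import time
from TTS.api import TTS

tts = TTS(model_name="tts_models/en/ljspeech/vits", progress_bar=False, gpu=False)

text = "Testing how close VITS gets to real time on a laptop CPU."
start = time.perf_counter()
wav = tts.tts(text=text)          # list of float samples
elapsed = time.perf_counter() - start

audio_seconds = len(wav) / 22050  # LJSpeech VITS outputs 22.05 kHz audio
print(f"synthesized {audio_seconds:.2f}s of audio in {elapsed:.2f}s "
      f"(RTF = {elapsed / audio_seconds:.2f})")
```

An RTF below 1.0 means synthesis is faster than real time.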

I've been working on supporting llama.cpp via llama-cpp-python; I'll see if I can make a PR soon (still working through it; attention masks are unsolved, as llama.cpp has them,...
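For reference, a sketch of the basic llama-cpp-python surface such a backend would wrap (placeholder path and parameters, not the actual PR code):

```python
# Minimal llama-cpp-python usage; the model path is a hypothetical placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-2-7b.Q4_K_M.gguf",  # placeholder local GGUF file
    n_ctx=2048,
    logits_all=True,   # keep per-token logits if scores are consumed downstream
    verbose=False,
)

# llama-cpp-python takes a prompt (or token ids) directly; there is no
# attention_mask argument like in HF transformers, which is part of what
# has to be bridged.
out = llm("Q: What is 3-bit quantization? A:", max_tokens=64, echo=False)
print(out["choices"][0]["text"])
```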

@iofu728 Just to update: the latest bug is with the logits. I'm not very experienced with low-level PyTorch, so my guess is that this line is to focus on...
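Since the comment is cut off, the following is only a generic illustration of what a logits-handling line in HF-style PyTorch code typically does (keep the last position's logits before the softmax), not the actual line in question:

```python
# Generic illustration only -- not the line referenced above.
# In HF-style causal LMs, logits have shape (batch, seq_len, vocab),
# and a common step is to slice out just the last position.
import torch

batch, seq_len, vocab = 1, 8, 32000
logits = torch.randn(batch, seq_len, vocab)   # stand-in for model(...).logits

next_token_logits = logits[:, -1, :]          # (batch, vocab)
probs = torch.softmax(next_token_logits, dim=-1)
next_token = torch.argmax(probs, dim=-1)
print(next_token.shape)  # torch.Size([1])
```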