Mozer

Results: 2 comments by Mozer

> It's been some days but I still can't find a quantized version of the 7b model. I found the results of the 7b are superior, but the generation time is killing...

I replaced the HF-transformers LLM engine with the exllamav3 LLM engine. It gave me a 3x speed-up for the LLM part. Overall speed for vibevoice-7b on my 3090 is now 9 it/s. And...