Mozer
> It's been some days but I still can't find a quantized version of the 7b model. I found the results of the 7b are superior, but the generation time is killing...
I replaced the HF-transformers LLM engine with the exllamav3 LLM engine. It gave me a 3x speed-up for the LLM part. Overall speed for vibevoice-7b on my 3090 is now 9 it/s. And...