Bahamut

Results: 5 comments by Bahamut

```
Error: Could not load the specified mbrola voice file.
Error: Could not load the specified mbrola voice file.
RuntimeError: failed to load voice "ja"
```

When running DiffRhythm on...

> Japanese voicepacks from ja = jp, oh shi--
> located at `C:\Program Files\eSpeak NG\espeak-ng-data\mbrola`.
> See [numediart/MBROLA#46 (comment)](https://github.com/numediart/MBROLA/issues/46#issuecomment-2701729556) for fix.

Thx!!!
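To see which voice entries actually exist in the directory quoted above (for example, whether the Japanese voice is stored under a `jp` name rather than `ja`), a minimal sketch like the following may help. The helper name `list_voice_entries` is hypothetical, and the path is copied from the quoted comment; it may differ on other installs.

```python
from pathlib import Path

# Path quoted in the comment above; adjust for a different install location.
MBROLA_DIR = Path(r"C:\Program Files\eSpeak NG\espeak-ng-data\mbrola")


def list_voice_entries(mbrola_dir):
    """Return the sorted names of entries in the mbrola data directory.

    Hypothetical helper: it only lists what is on disk and returns an
    empty list if the directory does not exist.
    """
    directory = Path(mbrola_dir)
    if not directory.is_dir():
        return []
    return sorted(entry.name for entry in directory.iterdir())


if __name__ == "__main__":
    # Print whatever is present so the names can be compared with the
    # voice name that failed to load ("ja" in the error above).
    print(list_voice_entries(MBROLA_DIR))
```

This only inspects the directory; the actual fix is the one described in the linked MBROLA comment.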

Inference is twice as slow when using a q8_0 KV cache: 16 tokens/sec unquantized vs. 8 tokens/sec with the q8_0 KV cache. With Mistral Nemo the slowdown is only ~5%.
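For comparing such measurements across models, the arithmetic can be sketched as below. The function names are hypothetical, and the inputs are the two throughput figures reported in this comment.

```python
def slowdown_factor(baseline_tps, measured_tps):
    """How many times slower the measured run is than the baseline."""
    return baseline_tps / measured_tps


def throughput_drop_percent(baseline_tps, measured_tps):
    """Throughput lost relative to the baseline, in percent."""
    return (1 - measured_tps / baseline_tps) * 100


if __name__ == "__main__":
    # Figures from the comment: 16 tokens/sec unquantized, 8 tokens/sec with q8_0.
    print(slowdown_factor(16, 8))          # 2.0
    print(throughput_drop_percent(16, 8))  # 50.0
```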

Up! Very sad bug: the context is large, but quantizing the KV cache kills inference speed. =(

> The problem is register pressure. Head size 256 needs more registers than head size 128 and a quantized KV cache also needs more registers than an FP16 KV cache....