Chatterbox too slow for realtime, tips on speeding it up?

Getting 2.75x realtime on a 3090, which is too slow for realtime use. Any tips? I'm willing to edit or retrain the model if needed. I have built a streaming engine for it, as well as better voice cloning via audio normalization and more. But to be usable it needs to be much faster. Are there any papers on it, or any general tips for speeding it up? Dynamic quantization didn't quite work.

May 29 '25 21:05 websines

Check my OpenAI API, it's a few posts down. It should be faster than that. I run on an A6000, which is slower than a 3090.

May 29 '25 21:05 darkacorn

Move the backend inference code to vLLM or TensorRT-LLM instead of basic HF Transformers inference (a lot more work).

May 30 '25 00:05 alpha-adam

Use streaming https://github.com/davidbrowne17/chatterbox-streaming ;)

May 30 '25 00:05 davidbrowne17

@davidbrowne17 you're a legend

May 30 '25 01:05 alpha-adam

@davidbrowne17 still very slow for me, 2.8x realtime on a 3090

May 30 '25 06:05 websines

The README advertises their paid service for ultra-low latency of sub-200ms, ideal for production use in agents, applications, or interactive media.

May 30 '25 21:05 kth8

Can anyone confirm if adding chatterbox support to fastrtc would handle the speed issue, since the streaming can be handled in fastrtc?

Jun 11 '25 14:06 talha-iqbal-mergestack

I'm confused. You are getting 2.75x realtime. Meaning for every minute of conversion you get 2.75 minutes of audio and that's bad?
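To pin down which way the number goes, here is a minimal sketch of the usual real-time factor (RTF) calculation. The function names, the `synthesize` callable, and the example durations are made up for illustration; none of them come from Chatterbox or from the measurements posted above.

```python
import time


def real_time_factor(synthesis_seconds, audio_seconds):
    """RTF = compute time / audio duration. Below 1.0 is faster than realtime."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def measure_rtf(synthesize, text, sample_rate):
    """Time one call to a hypothetical synthesize(text) returning a sequence of samples."""
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, len(samples) / sample_rate)


# Reading 1: one minute of compute yields 2.75 minutes of audio.
rtf_fast = real_time_factor(60.0, 165.0)
# Reading 2: one minute of audio takes 2.75 minutes of compute.
rtf_slow = real_time_factor(165.0, 60.0)
print(f"{rtf_fast:.2f} {rtf_slow:.2f}")  # prints "0.36 2.75"
```

Only the first reading (RTF below 1) can keep up with playback, so a post calling the model too slow for realtime presumably means the second.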

Sep 24 '25 00:09 danneauxs