
Too slow for realtime, tips on speeding it up

Open websines opened this issue 7 months ago • 8 comments

I'm getting 2.75x realtime on a 3090, which is too slow for realtime use. Any tips? I'm willing to edit or retrain the model if needed. I have built a streaming engine for it, as well as better voice cloning via audio normalization and more. But to be usable it needs to be way faster. Are there any papers on this, or any tips for speeding it up in general? Dynamic quantization didn't quite work.

May 29 '25 21:05 websines

Check my OpenAI API, it's a few posts down. It should be faster than that. I run on an A6000, which is slower than a 3090.

May 29 '25 21:05 darkacorn

Move the backend inference code to vLLM or TensorRT-LLM instead of basic Hugging Face Transformers inference (a lot more work).

May 30 '25 00:05 alpha-adam

Use streaming https://github.com/davidbrowne17/chatterbox-streaming ;)

May 30 '25 00:05 davidbrowne17

@davidbrowne17 you're a legend

May 30 '25 01:05 alpha-adam

@davidbrowne17 It's still very slow for me: 2.8x realtime on a 3090.

May 30 '25 06:05 websines

The README advertises their paid service for ultra-low latency of sub-200ms, ideal for production use in agents, applications, or interactive media.

May 30 '25 21:05 kth8

Can anyone confirm if adding chatterbox support to fastrtc would handle the speed issue, since the streaming can be handled in fastrtc?

Jun 11 '25 14:06 talha-iqbal-mergestack

I'm confused. You are getting 2.75x realtime, meaning that for every minute of conversion you get 2.75 minutes of audio, and that's bad?

Sep 24 '25 00:09 danneauxs
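
The "2.75x realtime" figure in this thread can be read two ways, and the thread does not say which one was meant. The sketch below just spells out the arithmetic for both readings. The function names are made up for illustration and are not part of chatterbox or any library.

```python
def speed_multiple(audio_seconds: float, wall_seconds: float) -> float:
    """Seconds of audio produced per second of compute (higher is faster)."""
    return audio_seconds / wall_seconds


def real_time_factor(audio_seconds: float, wall_seconds: float) -> float:
    """Seconds of compute per second of audio (lower is faster)."""
    return wall_seconds / audio_seconds


def keeps_up_with_playback(audio_seconds: float, wall_seconds: float) -> bool:
    """True if audio is generated at least as fast as it plays back."""
    return wall_seconds <= audio_seconds


# Reading A: 60 s of compute yields 165 s of audio (2.75x faster than playback).
reading_a = (165.0, 60.0)
# Reading B: 60 s of audio takes 165 s of compute (an RTF of 2.75).
reading_b = (60.0, 165.0)

for label, (audio_s, wall_s) in (("A", reading_a), ("B", reading_b)):
    print(
        f"reading {label}: speed multiple = {speed_multiple(audio_s, wall_s):.2f}, "
        f"RTF = {real_time_factor(audio_s, wall_s):.2f}, "
        f"keeps up = {keeps_up_with_playback(audio_s, wall_s)}"
    )
```

Under reading A, generation outpaces playback and the remaining concern would be latency to the first audio chunk rather than throughput. Under reading B, generation cannot keep up with playback at all. Reporting both audio seconds and wall-clock seconds removes the ambiguity.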