VoxCPM

Tags for speech

Open tensorjackal opened this issue 1 month ago • 3 comments

Hey! Does the model support tokens such as <Laughter>, <Breath>, and <Cry>?

If not, how can they be added to the model?

tensorjackal • Jan 02 '26 06:01

Hey! Does the model supports , , etc tokens?

If not, how to add them in the model?

I can't see the tags you mentioned, but currently the model doesn't support any tags.

a710128 • Jan 12 '26 07:01
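
The tags in the original question probably vanished because a Markdown renderer treats unescaped angle-bracket text such as <Laughter> as raw HTML and drops elements it does not recognize. A minimal sketch of that effect and one way around it, using only Python's standard library. The regex-based `strip_unknown_tags` helper is a hypothetical stand-in for a renderer's sanitizer, not GitHub's actual implementation:

```python
import html
import re


def strip_unknown_tags(text: str) -> str:
    """Hypothetical stand-in for an HTML sanitizer: drop anything shaped like <tag>."""
    return re.sub(r"</?[A-Za-z][^>]*>", "", text)


question = "Does the model support <Laughter>, <Breath>, <Cry> etc tokens?"

# Unescaped, the tags are removed and only the commas around them survive.
rendered = strip_unknown_tags(question)

# Escaping the angle brackets first leaves nothing tag-shaped to strip,
# so the tags stay visible as literal text.
escaped = html.escape(question)
safe_rendered = strip_unknown_tags(escaped)
```

When posting, wrapping each tag in backticks or writing it without angle brackets should have the same effect.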

My bad! I meant tags like laugh, breathe, and cry. If the model doesn't support them, how can we add them? Or could we give the model instructions such as "Speak in an angry tone"?

tensorjackal • Jan 12 '26 08:01

The current model does not support these tags, but I think with some fine-tuning it might be able to. We are also working on providing support for these tags. If we succeed, we will release it in future versions.

a710128 • Jan 12 '26 09:01
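
One way to read the fine-tuning suggestion above: collect recordings that actually contain the non-speech events and mark them in the transcripts with a small fixed tag vocabulary, so the model has a chance to associate each tag with its sound. The sketch below only prepares and validates such a manifest. The tag set, the `audio`/`text` field names, the file paths, and the JSONL layout are all assumptions for illustration, not a format confirmed by the VoxCPM maintainers:

```python
import json
import re

# Hypothetical tag vocabulary; the thread does not specify one.
ALLOWED_TAGS = {"laugh", "breathe", "cry"}
TAG_PATTERN = re.compile(r"\[([a-z_]+)\]")


def find_unknown_tags(text: str) -> set[str]:
    """Return bracketed tags in the transcript that are not in ALLOWED_TAGS."""
    return set(TAG_PATTERN.findall(text)) - ALLOWED_TAGS


def build_manifest(samples: list[tuple[str, str]]) -> list[str]:
    """Turn (audio_path, transcript) pairs into JSONL lines, rejecting unknown tags."""
    lines = []
    for audio_path, text in samples:
        unknown = find_unknown_tags(text)
        if unknown:
            raise ValueError(f"{audio_path}: unknown tags {sorted(unknown)}")
        lines.append(json.dumps({"audio": audio_path, "text": text}, ensure_ascii=False))
    return lines


# Placeholder paths and transcripts, for illustration only.
samples = [
    ("clips/0001.wav", "That was hilarious [laugh] I can't stop."),
    ("clips/0002.wav", "[breathe] Okay, let's try again."),
]
manifest_lines = build_manifest(samples)
```

Square brackets are used here instead of angle brackets so the tags survive HTML rendering. Whether the tokenizer keeps each tag as a single token is something to check before training.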

Can you share how we can add instructions as well? I tried prepending them to the prompt text, but they started bleeding into the generated audio.

tensorjackal • Jan 26 '26 15:01