The Speaking Includes Background Music

Open fcabanski opened this issue 4 months ago • 2 comments

On the 1.5 model, there is background music with the generated speech. How can this be stopped?

Oct 14 '25 10:10 fcabanski

Did you figure it out? I use Demucs, which is open source and from Meta, to split the vocals from the background music.
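For anyone who wants to script this, here is a minimal sketch of the Demucs route. It assumes Demucs is installed (`pip install demucs`) and that the `demucs` command is on your PATH. The helper names `build_demucs_command` and `separate_vocals` are made up for this example and are not part of Demucs or of the model discussed here.

```python
import subprocess


def build_demucs_command(audio_path: str, out_dir: str = "separated") -> list[str]:
    """Build a Demucs CLI call that splits a file into vocals and the rest."""
    return [
        "demucs",
        "--two-stems", "vocals",  # write only vocals.wav and no_vocals.wav
        "-o", out_dir,            # folder where the separated stems are written
        audio_path,
    ]


def separate_vocals(audio_path: str, out_dir: str = "separated") -> None:
    """Run Demucs on one file; raises CalledProcessError if the tool fails."""
    subprocess.run(build_demucs_command(audio_path, out_dir), check=True)


if __name__ == "__main__":
    # Hypothetical file name; replace with your generated speech file.
    print(" ".join(build_demucs_command("generated_speech.wav")))
```

With the default model, the stems typically end up in a subfolder of `out_dir` named after the model and the track. Keep `vocals.wav` and discard `no_vocals.wav`.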

Oct 16 '25 11:10 affibox

> On the 1.5 model, there is background music with the generated speech. How can this be stopped?

This is an intended artifact from training to watermark the generated audio.

Some voices, such as the ones already in the demo folder of the community version, are likely to generate it. It very often happens at the beginning or end, meaning it can be cut out manually; otherwise, you'd need to separate it using Demucs or the like.

The same goes for other random noises.
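If the music only shows up at the start or end, the manual cut can also be scripted. This is a rough sketch using only Python's standard `wave` module. It assumes the output is an uncompressed PCM WAV file and that you already know how many seconds to drop. `trim_wav` is a made-up helper name, not something shipped with the model.

```python
import wave


def trim_wav(src: str, dst: str, head_s: float = 0.0, tail_s: float = 0.0) -> int:
    """Copy src to dst, dropping head_s seconds from the start and tail_s
    seconds from the end. Returns the number of frames written."""
    with wave.open(src, "rb") as reader:
        params = reader.getparams()
        rate = reader.getframerate()
        total = reader.getnframes()
        # Clamp both cut points so over-long trims yield an empty clip, not an error.
        start = min(total, int(head_s * rate))
        end = max(start, total - int(tail_s * rate))
        reader.setpos(start)
        frames = reader.readframes(end - start)

    with wave.open(dst, "wb") as writer:
        writer.setparams(params)    # same channels, sample width, and rate
        writer.writeframes(frames)  # header frame count is corrected on close
    return end - start
```

For example, `trim_wav("generated_speech.wav", "trimmed.wav", head_s=1.5, tail_s=2.0)` would drop the first 1.5 seconds and the last 2 seconds. The file names and durations here are placeholders. Music that overlaps the speech itself still needs a source separator.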

Oct 22 '25 14:10 mohbenaicha