
Split chunk if it's too long

Open Satisfy256 opened this issue 1 year ago • 3 comments

Running inference on files that contain long, sustained sounds causes a CUDA out-of-memory error because the resulting chunks are too long.

If a chunk is very large, it could be split into two or more smaller chunks that are processed separately. That would probably leave a gap or a click between adjacent chunks, so a short cross-fade may be needed at each join.

Satisfy256 avatar Mar 29 '23 11:03 Satisfy256
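To make the proposal concrete, here is a minimal sketch of splitting an over-long chunk into overlapping pieces and joining the results with a linear cross-fade. It is not the project's implementation: `split_ranges`, `crossfade_concat`, `max_len`, and `overlap` are hypothetical names, plain Python lists stand in for audio arrays, and the sketch assumes the converted output of each piece has the same length and sample rate as its input, which a real model may not guarantee.

```python
def split_ranges(n_samples, max_len, overlap):
    """Return (start, end) ranges of at most max_len samples.

    Neighbouring ranges share `overlap` samples so they can be cross-faded.
    """
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")
    step = max_len - overlap
    ranges = []
    start = 0
    while True:
        end = min(start + max_len, n_samples)
        ranges.append((start, end))
        if end >= n_samples:
            break
        start += step
    return ranges


def crossfade_concat(chunks, overlap):
    """Join chunks, blending each shared region with a linear cross-fade."""
    out = list(chunks[0])
    for chunk in chunks[1:]:
        for i in range(overlap):
            # Weight of the incoming chunk rises from near 0 to near 1.
            w = (i + 1) / (overlap + 1)
            j = len(out) - overlap + i
            out[j] = out[j] * (1.0 - w) + chunk[i] * w
        out.extend(chunk[overlap:])
    return out


if __name__ == "__main__":
    signal = [float(i) for i in range(1000)]
    ranges = split_ranges(len(signal), max_len=300, overlap=50)
    # A real pipeline would run the (hypothetical) conversion on each piece here.
    pieces = [signal[s:e] for s, e in ranges]
    joined = crossfade_concat(pieces, overlap=50)
    print(len(ranges), len(joined))
```

Because the two fade weights always sum to 1, passing the pieces through unchanged reproduces the original signal, which makes the join easy to sanity-check before a model is put in the loop.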

I ran into this issue as well and wasn't sure whether it was my code, the model, or the GPU just acting flaky. Can we get confirmation that it is indeed best practice to split the target audio into chunks, perform inference on each chunk, and then concatenate them?

chrislee973 avatar Mar 30 '23 07:03 chrislee973

Try increasing db_thresh (decrease absolute value)

34j avatar Apr 01 '23 10:04 34j

Try increasing db_thresh (decrease absolute value)

That doesn't help with this specific sample: https://voca.ro/1aYDeSpvkIri The silence threshold was at -35. Setting it to -28 just creates a lot of gaps in the notes. Here's the OOM log: log.txt

Also, is it okay that VRAM usage stays high after inference is done? When I launch svcg, it is at 0.3 GB, but after inference finishes it stays at 10 GB.

Satisfy256 avatar Apr 02 '23 19:04 Satisfy256
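On the VRAM question: PyTorch's caching allocator generally keeps freed GPU blocks reserved for reuse instead of returning them to the driver, so tools such as nvidia-smi can report high usage after inference even when little memory is actively allocated. Whether that fully explains the 10 GB seen here is an assumption, since a loaded model also occupies memory. A minimal sketch for checking the difference, using the hypothetical helper names `cuda_memory_report` and `release_cached_memory`:

```python
import gc


def cuda_memory_report():
    """Return allocated and reserved CUDA memory in GiB.

    Returns None when PyTorch or a CUDA device is not available.
    """
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    gib = 1024 ** 3
    return {
        # Memory currently held by live tensors.
        "allocated_gib": torch.cuda.memory_allocated() / gib,
        # Memory held by the caching allocator, including freed blocks.
        "reserved_gib": torch.cuda.memory_reserved() / gib,
    }


def release_cached_memory():
    """Drop unreferenced Python objects, then release cached CUDA blocks."""
    gc.collect()
    try:
        import torch
    except ImportError:
        return
    if torch.cuda.is_available():
        torch.cuda.empty_cache()


if __name__ == "__main__":
    print(cuda_memory_report())
    release_cached_memory()
    print(cuda_memory_report())
```

If `reserved_gib` is much larger than `allocated_gib`, most of the reported usage is cache rather than live tensors. Releasing the cache does not free memory still referenced by a loaded model.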