so-vits-svc-fork
Split chunk if it's too long
Running inference on files that contain long sustained sounds causes a CUDA out-of-memory error because the chunks are too long.
If a chunk is very large, it should be split into two or more smaller chunks that are processed separately. This would probably cause a gap or a click between adjacent chunks, so a bit of cross-fading may be needed.
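A rough sketch of the idea, not the project's actual implementation: split the long chunk into overlapping pieces, run each piece through the model, and blend the overlaps with a linear cross-fade. The names `split_with_overlap`, `crossfade_concat`, `infer_long_chunk` and `infer_fn` are all hypothetical, and the sketch assumes `infer_fn` returns exactly as many samples as it receives.

```python
import numpy as np


def split_with_overlap(audio, max_len, overlap):
    """Split a 1-D array into pieces of at most max_len samples.

    Consecutive pieces share `overlap` samples so they can be cross-faded.
    """
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")
    step = max_len - overlap
    pieces = []
    start = 0
    while True:
        pieces.append(audio[start:start + max_len])
        if start + max_len >= len(audio):
            break
        start += step
    return pieces


def crossfade_concat(pieces, overlap):
    """Join pieces, linearly cross-fading the shared overlap region."""
    out = pieces[0].astype(np.float32)
    for piece in pieces[1:]:
        piece = piece.astype(np.float32)
        n = min(overlap, len(out), len(piece))
        if n > 0:
            fade_in = np.linspace(0.0, 1.0, n, dtype=np.float32)
            # Fade the previous piece out while fading the next piece in.
            out[-n:] = out[-n:] * (1.0 - fade_in) + piece[:n] * fade_in
        out = np.concatenate([out, piece[n:]])
    return out


def infer_long_chunk(audio, infer_fn, max_len, overlap):
    """Run infer_fn (hypothetical, length-preserving) piece by piece."""
    pieces = split_with_overlap(audio, max_len, overlap)
    return crossfade_concat([infer_fn(p) for p in pieces], overlap)
```

With this approach peak memory should depend on `max_len` rather than on the length of the sustained sound, at the cost of running the overlapping samples through the model twice. How long the overlap needs to be to hide the seam is something to tune by ear.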
I ran into this issue as well and wasn't sure whether it was my code, the model, or the GPU just acting flaky. Can we get confirmation that it is indeed best practice to split the target audio into chunks, perform inference on each chunk, and then concatenate the results?
Try increasing db_thresh (decrease absolute value)
That doesn't help with this specific sample: https://voca.ro/1aYDeSpvkIri
The silence threshold was at -35. If I set it to -28, it just creates a lot of gaps in the notes. Here's the OOM log: log.txt
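To illustrate why a higher threshold can cut into notes, here is a minimal sketch that assumes silence is detected by comparing each frame's RMS level in dB against the threshold. That is a common approach, but it is an assumption about how `db_thresh` is applied, and `silent_frames` and `frame_len` are hypothetical names, not the project's API.

```python
import numpy as np


def silent_frames(audio, db_thresh, frame_len=1024):
    """Return a boolean array: True where a frame's RMS level is below db_thresh."""
    n_frames = len(audio) // frame_len
    frames = audio[:n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames.astype(np.float64) ** 2, axis=1))
    # Clamp to avoid log10(0) on digital silence.
    level_db = 20.0 * np.log10(np.maximum(rms, 1e-10))
    return level_db < db_thresh


# A quiet sustained tone, roughly -33.5 dB RMS (amplitude 0.03).
sr = 44100
t = np.arange(sr) / sr
quiet_note = 0.03 * np.sin(2 * np.pi * 220 * t)

kept_at_35 = not silent_frames(quiet_note, -35).any()  # no frame counts as silence
cut_at_28 = silent_frames(quiet_note, -28).all()       # every frame counts as silence
```

Under that assumption, any sustained passage quieter than the threshold is treated as silence, which would explain the gaps at -28 while -35 leaves the long chunks intact.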
Also, is it okay that VRAM usage stays high after inference is done? When I launch svcg, VRAM usage is at 0.3 GB, but after inference finishes it's at 10 GB.