
[Feature request] Automatic batch processing of long files on harvest for limited RAM machines

Open kalomaze opened this issue 1 year ago • 4 comments

A common complaint from Google Colab users is that harvest is unusable without manually splitting their song into parts. I wonder if an argument to `python infer-web.py` could be used to force inference to run in batches, so it still exports without erroring out? If not that, how about splitting the song into ~30 s intervals that cut at silence, then stitching the results together to produce a complete wav?
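The split-and-stitch idea could be sketched like this. None of this is from the RVC codebase; the function name, the 30 s limit, and the silence threshold are all illustrative assumptions, using plain NumPy on a mono float waveform:

```python
import numpy as np

def split_at_silence(wav, sr, max_len_s=30.0, silence_thresh=1e-3):
    """Split a mono waveform into chunks of at most max_len_s seconds,
    preferring cut points where the signal is near-silent."""
    max_len = int(max_len_s * sr)
    chunks = []
    start = 0
    while len(wav) - start > max_len:
        # Search the last second of the window for the quietest sample.
        window = np.abs(wav[start + max_len - sr : start + max_len])
        cut = start + max_len - sr + int(np.argmin(window))
        # If nothing in the window is quiet, just cut at the hard limit.
        if window.min() > silence_thresh:
            cut = start + max_len
        chunks.append(wav[start:cut])
        start = cut
    chunks.append(wav[start:])
    return chunks
```

Because each chunk is a slice of the original array, concatenating the chunks in order reproduces the input exactly, so the per-chunk inference outputs can be stitched back in the same order.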

kalomaze avatar May 10 '23 15:05 kalomaze

decrease "Number of CPU threads to use for pitch extraction"

RVC-Boss avatar May 10 '23 16:05 RVC-Boss

This is specifically for inference. It times out or errors with a song longer than about a minute and a half (on a Google Colab). Also, I notice some users already split up their inference file into parts to improve quality by giving the model less to process at once. This could help automate that, and also avoid the bug where parts of the audio become very quiet after long silent periods in the vocals. The biggest problem, I'm guessing, is finding where silence begins and ends to make good cut points.
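Finding silence boundaries is commonly done with a frame-level energy threshold. A minimal sketch (the function name, frame size, and thresholds are made up, not from RVC): compute per-frame RMS and report stretches that stay below a threshold for some minimum duration:

```python
import numpy as np

def find_silence_regions(wav, sr, frame_ms=20, thresh=1e-3, min_silence_s=0.3):
    """Return (start, end) sample indices of silent stretches, found by
    comparing per-frame RMS energy against a fixed threshold."""
    frame = int(sr * frame_ms / 1000)
    n_frames = len(wav) // frame
    # RMS energy of each non-overlapping frame.
    rms = np.sqrt(np.mean(
        wav[: n_frames * frame].reshape(n_frames, frame) ** 2, axis=1))
    silent = rms < thresh
    regions, start = [], None
    for i, s in enumerate(silent):
        if s and start is None:
            start = i                      # silence begins
        elif not s and start is not None:  # silence ends
            if (i - start) * frame >= min_silence_s * sr:
                regions.append((start * frame, i * frame))
            start = None
    if start is not None and (n_frames - start) * frame >= min_silence_s * sr:
        regions.append((start * frame, n_frames * frame))
    return regions
```

The midpoint of each returned region would make a reasonable cut point, since splitting in the middle of a silent stretch leaves breathing room on both sides of the seam.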

kalomaze avatar May 10 '23 19:05 kalomaze

I'll soon be working on an 'inference batcher script' on my fork at https://github.com/Mangio621/Mangio-RVC-Fork

[screenshot attached]

Mangio621 avatar May 11 '23 11:05 Mangio621

Can we also limit RAM usage when splitting audio? For example, cap it at 12 GB, since the Colab free tier's limit is 12 GB.
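On Linux (which Colab runs), a hard cap like that can be set from inside Python with the standard-library `resource.setrlimit`. Note the caveat: this only makes over-allocation fail fast with a catchable MemoryError inside the process, rather than the whole VM being killed; it does not make harvest actually fit in 12 GB. The helper name and GiB-based interface here are just an illustration:

```python
import resource

def cap_address_space(gib):
    """Hard-cap this process's virtual address space (Linux only).
    Allocations beyond the cap raise MemoryError instead of getting
    the Colab VM OOM-killed by the host."""
    limit = int(gib * 1024 ** 3)
    resource.setrlimit(resource.RLIMIT_AS, (limit, limit))

cap_address_space(12)
```

RLIMIT_AS counts virtual address space rather than resident RAM, so the cap should be set a little above the real memory budget to leave room for the interpreter's own mappings.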

aracap avatar May 13 '23 05:05 aracap