Retrieval-based-Voice-Conversion-WebUI
Easily train a good VC model with voice data <= 10 mins!
When I click Train Model, these lines appear in the terminal: ``` write filelist done use gpus: python train_nsf_sim_cache_sid_load_pretrain.py -e riko -sr 40k -f0 1 -bs 1 -te 20 -se...
Setting 44100 as the resample option causes only the first conversion to work. After that, the audio is sped up to match the sample rate requirement. Doesn't work as...
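For what it's worth, the "sped up" symptom is consistent with the samples not actually being resampled: if the output is labeled or played at a higher rate than it was generated at, every sample is consumed faster and the audio shortens. A back-of-envelope sketch, assuming a 40k model feeding a 44.1 kHz target:

```python
# If 40 kHz model output reaches a 44.1 kHz sink without real resampling,
# playback speed scales by the ratio of the two rates.
model_sr = 40000       # assumed model output rate (40k model)
target_sr = 44100      # the resample target set in the UI
speedup = target_sr / model_sr           # ~1.10x faster, pitch shifted up
duration_scale = model_sr / target_sr    # audio lasts ~91% of its true length
```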
I was wondering whether the batch processing feature on the interface processes all files at once via threading (since pm and harvest run primarily on the CPU), or each...
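A minimal sketch of the two modes being asked about, using Python's `concurrent.futures`; the `extract_f0` body here is a placeholder, not RVC's actual code:

```python
from concurrent.futures import ThreadPoolExecutor

def extract_f0(path):
    # placeholder for per-file pm/harvest extraction (CPU-bound in practice)
    return path, len(path)

paths = ["a.wav", "b.wav", "c.wav"]

# concurrent: all files in flight at once
with ThreadPoolExecutor(max_workers=2) as pool:
    concurrent_results = dict(pool.map(extract_f0, paths))

# sequential: one file after another
sequential_results = dict(extract_f0(p) for p in paths)

# For genuinely CPU-bound extraction, ProcessPoolExecutor sidesteps the GIL
# (it needs an `if __name__ == "__main__":` guard on Windows/macOS).
```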
Hello! I notice the models use a hop size of 10 ms, but many SVS papers suggest 5 ms. Is there a reason this hop size was used?
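For context, hop size in samples is just the sample rate times the hop duration; a quick check at RVC's common rates:

```python
def hop_samples(hop_ms, sr):
    """Frames advance by sr * hop_ms / 1000 samples."""
    return int(sr * hop_ms / 1000)

print(hop_samples(10, 40000))  # 10 ms hop at 40 kHz -> 400 samples
print(hop_samples(5, 40000))   # 5 ms hop at 40 kHz -> 200 samples
```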
The ONNX model exported by RVC cannot be used in voice-changer (https://github.com/w-okada/voice-changer); it throws an error. The same model does infer correctly in VC when used as a .pth directly, but with the .pth I can't use DirectML to leverage the integrated GPU's compute (the ONNX sample model bundled with VC does run inference on the integrated GPU). Below is the VC error: [Voice Changer] VC PROCESSING EXCEPTION!!! Required inputs (['phone', 'phone_lengths', 'ds', 'rnd']) are missing from input feed (['feats', 'p_len', 'pitch', 'pitchf', 'sid']). Traceback (most recent call last): File "voice_changer\VoiceChanger.py", line...
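The traceback itself pins down the problem: the two sides disagree on input names. A set comparison of the names quoted in the error shows the mismatch (onnxruntime's `InferenceSession(...).get_inputs()` can be used to list an exported model's actual input names):

```python
# Input names quoted in the voice-changer error message.
required = {"phone", "phone_lengths", "ds", "rnd"}   # what the ONNX model expects
fed = {"feats", "p_len", "pitch", "pitchf", "sid"}   # what the client provides
missing = required - fed        # every required input is missing
unexpected = fed - required     # every fed input is unrecognized
```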
[timbremodel](http://www.timbremodel.com/#/index)
Getting this error during model inference. It produces an mp3 file, but with no sound ``` 2023-06-29 16:07:06 | ERROR | root | Exception in callback _ProactorBasePipeTransport._call_connection_lost(None) handle: Traceback (most recent call last):...
Hello, I just trained on two speakers at the same time. The filelist looks like this: ``` /home/ubuntu/RVC-beta-v2-0528/logs/merged/0_gt_wavs/0_4_48.wav|/home/ubuntu/RVC-beta-v2-0528/logs/merged/3_feature768/0_4_48.npy|/home/ubuntu/RVC-beta-v2-0528/logs/merged/2a_f0/0_4_48.wav.npy|/home/ubuntu/RVC-beta-v2-0528/logs/merged/2b-f0nsf/0_4_48.wav.npy|0 /home/ubuntu/RVC-beta-v2-0528/logs/merged/0_gt_wavs/1_2_6.wav|/home/ubuntu/RVC-beta-v2-0528/logs/merged/3_feature768/1_2_6.npy|/home/ubuntu/RVC-beta-v2-0528/logs/merged/2a_f0/1_2_6.wav.npy|/home/ubuntu/RVC-beta-v2-0528/logs/merged/2b-f0nsf/1_2_6.wav.npy|1 ... ``` I have 184 samples of the first speaker, and...
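For reference, each filelist line above is five pipe-separated fields with the speaker id last; a hedged parsing sketch, with paths abbreviated for illustration:

```python
# One line in the format shown above (paths shortened for the example).
# Fields: gt wav | feature npy | f0 npy | f0nsf npy | speaker id.
line = ("logs/merged/0_gt_wavs/0_4_48.wav|"
        "logs/merged/3_feature768/0_4_48.npy|"
        "logs/merged/2a_f0/0_4_48.wav.npy|"
        "logs/merged/2b-f0nsf/0_4_48.wav.npy|0")
gt_wav, feature, f0, f0nsf, speaker_id = line.split("|")
speaker_id = int(speaker_id)  # 0 for the first speaker, 1 for the second
```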
Sometimes HuBERT mishears words (phonetics?) and encodes them incorrectly. Is there a potential solution where you can manually specify what is being fed in when vocoding?