Retrieval-based-Voice-Conversion-WebUI
Unnatural pitch gliding in the converted voice
Describe the bug I'm unsure whether this is a feature or a bug, but I hear strange pitch glides from RVC models when pitch guidance is enabled during inference; real-time conversion is affected too. Interestingly, this quirk is virtually absent in the RVC-Boss repo.
To Reproduce
- Open the web GUI.
- Select any RVC model.
- Create an input audio file containing tones with a large pitch difference, then convert it.
- Listen to the output audio; you'll hear the pitch glides between the tones.
Expected behavior The converted voice doesn't exhibit pitch gliding, or at least the gliding is minimized.
Screenshots Not applicable.
Desktop (please complete the following information):
- OS and version: Arch Linux (version N/A)
- Python version: 3.10.13
- Commit/Tag with the issue: Latest
Additional context All pitch detection algorithms except FCPE tracked the pitch jumps poorly, with RMVPE being the worst offender. The hop length appears to be set higher than I'd like.
Maybe it is because this repo uses a unified F0 extraction class, which introduced the interpolation behavior, but I'm not 100% sure. If you don't mind, could you provide the audio you used so we can test with it? Thanks!
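To make the interpolation hypothesis concrete, here is a minimal sketch of how upsampling a frame-level F0 contour could turn a pitch jump into a glide. It is not taken from this repo's F0 class: `upsample_f0`, its `factor` parameter, and the "hold" alternative are hypothetical names used only for illustration. Under this simple model, linear interpolation spreads a step from 120 Hz to 960 Hz over one hop, so a hop of H samples at sample rate sr gives a ramp of roughly H / sr seconds, and a larger hop length means a longer audible glide.

```python
import numpy as np


def upsample_f0(f0_frames, factor, mode="linear"):
    """Upsample a frame-level F0 contour by an integer factor.

    Hypothetical helper for illustration only, not the repo's code.
    mode="linear" interpolates between frame values (creates a ramp at jumps).
    mode="hold" repeats each frame value (keeps jumps as hard steps).
    """
    f0_frames = np.asarray(f0_frames, dtype=float)
    n_out = len(f0_frames) * factor
    src = np.arange(len(f0_frames)) * factor  # output index where each frame sits
    dst = np.arange(n_out)
    if mode == "linear":
        # np.interp clamps to the last frame value beyond the final source point.
        return np.interp(dst, src, f0_frames)
    if mode == "hold":
        return f0_frames[dst // factor]
    raise ValueError(f"unknown mode: {mode}")


def count_glide_points(f0, low_hz, high_hz):
    """Count output points that lie strictly between the two target pitches."""
    f0 = np.asarray(f0)
    return int(np.sum((f0 > low_hz) & (f0 < high_hz)))


# A hard step from 120 Hz to 960 Hz, five frames each.
frames = [120.0] * 5 + [960.0] * 5
linear = upsample_f0(frames, factor=4, mode="linear")
hold = upsample_f0(frames, factor=4, mode="hold")
```

With these inputs, `count_glide_points(linear, 120.0, 960.0)` reports the in-between pitch values introduced by interpolation, while the "hold" contour contains none. Whether the actual code path behaves this way would still need to be confirmed against the repo.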
https://drive.google.com/file/d/1EufMz1hMB5M_HMJoIuS7pB9wh8aslnyE/view The audio contains sawtooth waves at 120 Hz and 960 Hz, each lasting half a second.
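For anyone who cannot open the link, a comparable test signal can be generated locally. This is a sketch under assumptions rather than a reconstruction of the linked file: the 44.1 kHz sample rate, the 0.5 amplitude, the low-then-high ordering, the half-second length of each segment, and the naive (non-band-limited) sawtooth are all guesses, and `sawtooth`, `make_test_signal`, and `write_wav` are illustrative names.

```python
import wave

import numpy as np


def sawtooth(freq_hz, duration_s, sample_rate):
    """Naive (non-band-limited) sawtooth with values in [-1, 1)."""
    n = int(round(duration_s * sample_rate))
    t = np.arange(n) / sample_rate
    return 2.0 * ((t * freq_hz) % 1.0) - 1.0


def make_test_signal(sample_rate=44100, low_hz=120.0, high_hz=960.0,
                     seg_s=0.5, amplitude=0.5):
    """Concatenate a low tone and a high tone to create one abrupt pitch jump.

    All defaults except the two frequencies and the segment length are assumptions.
    """
    low = sawtooth(low_hz, seg_s, sample_rate)
    high = sawtooth(high_hz, seg_s, sample_rate)
    return amplitude * np.concatenate([low, high])


def write_wav(path, signal, sample_rate=44100):
    """Write a mono 16-bit PCM WAV file using only the standard library."""
    pcm = (np.clip(signal, -1.0, 1.0) * 32767).astype("<i2")  # little-endian int16
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(sample_rate)
        f.writeframes(pcm.tobytes())
```

Calling `write_wav("pitch_jump.wav", make_test_signal())` produces a one-second file that can be fed to the web GUI in the reproduction steps; the file name is arbitrary.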