Retrieval-based-Voice-Conversion-WebUI

Unnatural pitch gliding in the converted voice

Open • TheTrustedComputer opened this issue 1 year ago • 2 comments

Describe the bug

I'm unsure whether this is a feature or a bug, but I hear strange pitch glides from RVC models when pitch guidance is enabled during inference; real-time conversion is affected as well. Interestingly, this quirk is virtually absent in RVC-Boss's repo.

To Reproduce

  1. Open the web GUI.
  2. Select any RVC model.
  3. Create an input audio file containing consecutive tones with a large pitch difference, then convert it.
  4. Listen to the output audio; you'll hear pitch glides at the transitions between the tones.

Expected behavior

The converted voice doesn't exhibit pitch-gliding behavior, or at least the gliding is minimized.

Screenshots

Not applicable.

Desktop (please complete the following information):

  • OS and version: Arch Linux (version N/A)
  • Python version: 3.10.13
  • Commit/Tag with the issue: Latest

Additional context

All pitch detection algorithms except FCPE tracked the pitch jumps poorly, with RMVPE being the worst offender. It appears that the hop length is set too high.
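To make the hop-length remark concrete, here is a small sketch of how hop length translates into the time resolution of an F0 track. The sample rate and hop values below are illustrative assumptions, not settings confirmed from this repo or from any particular pitch extractor.

```python
def frame_period_ms(hop_length: int, sample_rate: int) -> float:
    """Time between two consecutive F0 frames, in milliseconds."""
    return 1000.0 * hop_length / sample_rate


def frames_in(duration_s: float, hop_length: int, sample_rate: int) -> int:
    """Number of whole F0 frames that fit into a segment of the given duration."""
    return int(duration_s * sample_rate // hop_length)


# Hypothetical settings: 16 kHz analysis rate, three candidate hop lengths.
SAMPLE_RATE = 16000
for hop in (64, 160, 512):
    print(
        f"hop={hop:4d}  "
        f"frame period={frame_period_ms(hop, SAMPLE_RATE):5.1f} ms  "
        f"frames per 0.5 s tone={frames_in(0.5, hop, SAMPLE_RATE)}"
    )
```

The larger the hop, the fewer F0 frames describe each tone, so an abrupt pitch jump is located less precisely in time and any smoothing across frames covers a longer stretch of audio.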

TheTrustedComputer, Jul 19 '24 14:07

Maybe it is because this repo uses a unified F0 extraction class, which introduced the interpolation behavior, but I'm not 100% sure. If you don't mind, could you provide the audio you used so that we can test with it? Thanks!
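As a rough illustration of how interpolation could produce a glide, the sketch below upsamples a frame-level F0 track in two ways. It is a toy model under stated assumptions: the function names, the frame values, and the upsampling factor are hypothetical and are not taken from this repo's F0 class.

```python
def upsample_linear(frames, factor):
    """Linearly interpolate between consecutive frame values."""
    out = []
    for a, b in zip(frames, frames[1:]):
        for k in range(factor):
            out.append(a + (b - a) * k / factor)
    out.append(frames[-1])
    return out


def upsample_hold(frames, factor):
    """Repeat each frame value (sample-and-hold), keeping jumps abrupt."""
    return [v for v in frames for _ in range(factor)]


def count_intermediate(track, low, high):
    """Count values that lie strictly between the two tone frequencies."""
    return sum(1 for v in track if low < v < high)


# Hypothetical frame-level F0: five frames at 120 Hz, then five at 960 Hz.
f0_frames = [120.0] * 5 + [960.0] * 5

linear = upsample_linear(f0_frames, 4)
hold = upsample_hold(f0_frames, 4)

print("linear:", count_intermediate(linear, 120.0, 960.0), "intermediate values")
print("hold:  ", count_intermediate(hold, 120.0, 960.0), "intermediate values")
```

With linear interpolation, the jump is replaced by a short ramp through frequencies that never occurred in the input, which would be heard as a glide. Holding each frame value keeps the step abrupt. Whether the unified F0 class actually behaves like the linear case is exactly the open question here.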

fumiama, Jul 24 '24 08:07

https://drive.google.com/file/d/1EufMz1hMB5M_HMJoIuS7pB9wh8aslnyE/view

The audio contains sawtooth waves at 120 Hz and 960 Hz, each lasting half a second.
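For anyone who cannot access the file, a comparable test signal can be approximated with the sketch below. It is not the script used to create the linked audio: the sample rate, amplitude, tone order, output file name, and the naive (non-band-limited) sawtooth are all assumptions.

```python
import struct
import wave

SAMPLE_RATE = 44100  # assumed; the sample rate of the linked file is not stated


def sawtooth(freq_hz, duration_s, sample_rate=SAMPLE_RATE, amplitude=0.5):
    """Naive sawtooth ramping from -amplitude up to just below +amplitude."""
    n = int(duration_s * sample_rate)
    return [
        amplitude * (2.0 * ((i * freq_hz / sample_rate) % 1.0) - 1.0)
        for i in range(n)
    ]


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono 16-bit PCM using only the standard library."""
    pcm = struct.pack("<%dh" % len(samples), *(int(s * 32767) for s in samples))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(sample_rate)
        f.writeframes(pcm)


# Half a second at 120 Hz followed by half a second at 960 Hz (assumed order).
signal = sawtooth(120.0, 0.5) + sawtooth(960.0, 0.5)

if __name__ == "__main__":
    write_wav("sawtooth_120_960.wav", signal)  # hypothetical file name
```

A naive sawtooth aliases slightly at 960 Hz, which should not matter for checking whether the converted output glides at the 120 Hz to 960 Hz transition.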

TheTrustedComputer, Jul 25 '24 04:07