soft-vc

skipped phonemes in generated audio

Open thivux opened this issue 1 year ago • 0 comments

hi, thank you for sharing your code.

i am trying to do voice conversion from English speech to a Vietnamese speaker's voice. to do that, i did the following steps:

  • extract units for both the English and Vietnamese datasets
  • train kmeans on both types of units & extract discrete labels
  • train soft encoder
  • extract soft units
  • train acoustic model
  • train hifigan on Vietnamese dataset
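
To make the k-means step in the list concrete, here is a minimal, standard-library-only sketch of fitting one set of centroids on the pooled English and Vietnamese units and then assigning each frame to its nearest centroid to get discrete labels. It is an illustration under stated assumptions, not the repository's training code: the names `fit_kmeans` and `extract_labels` are hypothetical, the 2-dimensional frames are toy stand-ins for real content-encoder features, and the farthest-first initialisation is chosen only to keep the example deterministic. A real run would more likely use an existing k-means implementation on much higher-dimensional features.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids, frame):
    """Index of the centroid closest to `frame` (ties go to the lowest index)."""
    return min(range(len(centroids)),
               key=lambda k: squared_distance(centroids[k], frame))


def fit_kmeans(frames, n_clusters, n_iters=20):
    """Plain Lloyd's k-means with a deterministic farthest-first initialisation."""
    # Assumption: start from the first frame, then repeatedly add the frame
    # that is farthest from every centroid chosen so far.
    centroids = [list(frames[0])]
    while len(centroids) < n_clusters:
        farthest = max(
            frames,
            key=lambda f: min(squared_distance(c, f) for c in centroids),
        )
        centroids.append(list(farthest))

    for _ in range(n_iters):
        # Assignment step: group frames by their nearest centroid.
        buckets = [[] for _ in range(n_clusters)]
        for frame in frames:
            buckets[nearest(centroids, frame)].append(frame)
        # Update step: move each centroid to the mean of its members.
        for k, members in enumerate(buckets):
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def extract_labels(centroids, frames):
    """Discrete label (cluster index) for every frame."""
    return [nearest(centroids, frame) for frame in frames]


# Toy stand-ins for per-frame units from each language (hypothetical values).
english_units = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1]]
vietnamese_units = [[5.1, 5.0], [0.05, 0.05], [4.9, 5.0]]

# One shared codebook fitted on both languages, as described in the steps above.
centroids = fit_kmeans(english_units + vietnamese_units, n_clusters=2)
english_labels = extract_labels(centroids, english_units)
vietnamese_labels = extract_labels(centroids, vietnamese_units)
```

Because the centroids are fitted on the pooled data, frames from both languages share one label space, which is what the later soft-encoder training step consumes.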

the output for Vietnamese speech (input audio is Vietnamese, from a different speaker) is okay, but the output for English input is not that good: phonemes are often skipped or mispronounced. do you have any suggestions on how i can improve the results?

thivux, Feb 26 '24 05:02