[Bug] Hoarseness in Higher-Pitched Female Voices with xtts-v2 after finetune
Describe the bug
After fine-tuning the xtts-v2 model, generated higher-pitched female voices have a noticeable hoarseness, resembling the strain of someone trying to reach high musical notes.
Abnormal example: https://mork.ro/NQjFi
Normal example: https://mork.ro/3iZ8Q#
Both clips were generated by the same fine-tuned model; only the audio prompt differs.
To Reproduce
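Run inference with the fine-tuned xtts-v2 model, using a higher-pitched female voice as the audio prompt.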
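For reference, the inference step could look roughly like the sketch below. It is not the exact script used for this report. It assumes the fine-tuned checkpoint directory contains `config.json`, `model.pth` and `vocab.json`, and the helper names (`output_path_for`, `synthesize`), the paths and the `out` directory are hypothetical. Running the same text through both audio prompts should make the two clips directly comparable.

```python
from pathlib import Path


def output_path_for(prompt_path: str, out_dir: str = "out") -> str:
    """Derive an output WAV path from the audio prompt's file name (hypothetical helper)."""
    return str(Path(out_dir) / f"{Path(prompt_path).stem}_xtts.wav")


def synthesize(checkpoint_dir: str, prompt_path: str, text: str, language: str = "en") -> str:
    """Generate one clip with a fine-tuned XTTS checkpoint and return the output path."""
    # Heavy dependencies are imported lazily so the helper above works without them.
    import torch
    import torchaudio
    from TTS.tts.configs.xtts_config import XttsConfig
    from TTS.tts.models.xtts import Xtts

    # Assumption: checkpoint_dir holds config.json, model.pth and vocab.json.
    config = XttsConfig()
    config.load_json(str(Path(checkpoint_dir) / "config.json"))
    model = Xtts.init_from_config(config)
    model.load_checkpoint(config, checkpoint_dir=checkpoint_dir, use_deepspeed=False)
    if torch.cuda.is_available():
        model.cuda()

    # The audio prompt is the only thing that changes between the two examples.
    gpt_cond_latent, speaker_embedding = model.get_conditioning_latents(
        audio_path=[prompt_path]
    )
    out = model.inference(text, language, gpt_cond_latent, speaker_embedding)

    out_path = output_path_for(prompt_path)
    Path(out_path).parent.mkdir(parents=True, exist_ok=True)
    # XTTS produces 24 kHz mono audio.
    torchaudio.save(out_path, torch.tensor(out["wav"]).unsqueeze(0), 24000)
    return out_path
```

Calling `synthesize("/path/to/finetuned", "high_female.wav", "Some test sentence.")` and then the same call with the lower-pitched prompt would give one clip per prompt from the same model.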
Expected behavior
No response
Logs
No response
Environment
{
    "CUDA": {
        "GPU": [
            "NVIDIA GeForce RTX 4090"
        ],
        "available": true,
        "version": "12.1"
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "2.1.1+cu121",
        "TTS": "0.22.0",
        "numpy": "1.22.0"
    },
    "System": {
        "OS": "Linux",
        "architecture": [
            "64bit",
            "ELF"
        ],
        "processor": "x86_64",
        "python": "3.10.13",
        "version": "#202310061235~1697396945~22.04~9283e32 SMP PREEMPT_DYNAMIC Sun O"
    }
}
Additional context
No response