
[Bug] Bark ignores custom voice sample for cloning and generates output using random voices instead

Open aroslanov opened this issue 1 year ago • 3 comments

Describe the bug

Bark ignores the given voice sample on Windows 11. It finds the reference WAV and successfully creates the .npz file, but the generated output file uses completely random voices. The same thing happens whether Bark is called through TTS.api or through TTS.tts.models.bark.

To Reproduce

from TTS.api import TTS
tts = TTS("tts_models/multilingual/multi-dataset/bark", gpu=True)
tts.tts_to_file(text="Hello, my name is Manmay , how are you?",
                file_path="bark.wav",
                voice_dir=r"D:\voices",
                speaker="custom_speaker")
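
To narrow this down, it may help to check what is actually sitting in the speaker folder before the call above runs. The sketch below is only a diagnostic, not a fix. It assumes the layout I understand Bark in Coqui TTS to expect (a folder named after the speaker inside voice_dir, holding the reference .wav and the generated .npz), and it assumes the .npz should contain arrays named semantic_prompt, coarse_prompt and fine_prompt. Both of those are assumptions on my part, and inspect_speaker_dir is a made-up helper name, not part of the TTS package. It uses only the standard library, relying on the fact that an .npz file is a zip archive with one .npy member per array.

```python
import zipfile
from pathlib import Path

# Assumption: a Bark-style voice prompt stores these three arrays.
EXPECTED_KEYS = {"semantic_prompt", "coarse_prompt", "fine_prompt"}


def inspect_speaker_dir(voice_dir, speaker):
    """Hypothetical helper: list reference files for a speaker and the arrays in each .npz."""
    # Assumption: the speaker's files live in <voice_dir>/<speaker>/.
    speaker_dir = Path(voice_dir) / speaker
    report = {
        "speaker_dir_exists": speaker_dir.is_dir(),
        "wav_files": [],
        "npz_files": [],
        "npz_keys": {},
        "missing_keys": {},
    }
    if not report["speaker_dir_exists"]:
        return report

    report["wav_files"] = sorted(p.name for p in speaker_dir.glob("*.wav"))

    for npz_path in sorted(speaker_dir.glob("*.npz")):
        report["npz_files"].append(npz_path.name)
        # An .npz file is a zip archive; each array is stored as "<name>.npy".
        with zipfile.ZipFile(npz_path) as archive:
            keys = {
                Path(member).stem
                for member in archive.namelist()
                if member.endswith(".npy")
            }
        report["npz_keys"][npz_path.name] = sorted(keys)
        report["missing_keys"][npz_path.name] = sorted(EXPECTED_KEYS - keys)
    return report


if __name__ == "__main__":
    # Same voice_dir and speaker as in the reproduction above.
    for field, value in inspect_speaker_dir(r"D:\voices", "custom_speaker").items():
        print(f"{field}: {value}")
```

If the folder is missing, or the .npz lacks one of the expected arrays, that would at least show whether the problem is in creating the voice prompt or in how it is used during generation.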

Expected behavior

No response

Logs

No response

Environment

{
    "CUDA": {
        "GPU": [
            "NVIDIA GeForce RTX 4080"
        ],
        "available": true,
        "version": "12.1"
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "2.2.0+cu121",
        "TTS": "0.22.0",
        "numpy": "1.22.0"
    },
    "System": {
        "OS": "Windows",
        "architecture": [
            "64bit",
            "WindowsPE"
        ],
        "processor": "AMD64 Family 25 Model 97 Stepping 2, AuthenticAMD",
        "python": "3.10.11",
        "version": "10.0.22631"
    }
}

Additional context

No response

aroslanov avatar Feb 11 '24 01:02 aroslanov

Same here. Have you found any solutions?

BIOVPEPPER avatar Feb 17 '24 14:02 BIOVPEPPER

Same problem

changjinhan avatar Apr 01 '24 01:04 changjinhan

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.

stale[bot] avatar May 11 '24 23:05 stale[bot]