TTS [Bug] Multiple speaker requests?

Describe the bug

The TTS API states that speaker_wav can be a list of filepaths for multiple speaker references. But in def tts_to_file(...), speaker_wav only accepts a single string.
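
For illustration only, a caller-side helper that accepts either form could look like the sketch below. The name as_path_list is hypothetical and is not part of the TTS library; it only shows what "a string or a list of strings" means in practice.

```python
from typing import List, Sequence, Union


def as_path_list(speaker_wav: Union[str, Sequence[str]]) -> List[str]:
    """Hypothetical helper: return speaker_wav as a list of file paths."""
    # A single path is wrapped in a one-element list.
    if isinstance(speaker_wav, str):
        return [speaker_wav]
    # Any other sequence of paths is copied into a plain list, order preserved.
    return list(speaker_wav)
```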

To Reproduce

tts.tts_to_file(
    text="Some test",
    file_path="output.wav",
    speaker_wav=["training/1.wav"],
    language="en",
)

Expected behavior

No response

Logs

No response

Environment

{
    "CUDA": {
        "GPU": [],
        "available": false,
        "version": null
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "2.1.1",
        "TTS": "0.20.6",
        "numpy": "1.26.2"
    },
    "System": {
        "OS": "Darwin",
        "architecture": [
            "64bit",
            ""
        ],
        "processor": "arm",
        "python": "3.11.6",
        "version": "Darwin Kernel Version 22.5.0: Thu Jun  8 22:22:20 PDT 2023; root:xnu-8796.121.3~7/RELEASE_ARM64_T6000"
    }
}

Additional context

No response

Nov 20 '23 18:11 mukundt

Have you tried giving a list of strings?

Nov 28 '23 10:11 erogol

It works for me in this way:

tts.tts_to_file(
    text="Some test",
    file_path="output.wav",
    speaker_wav=["training/1.wav","training/2.wav","training/3.wav"],
    language="en",
)

But the order is important, because the main timbre will be the first wav in the list... it's like the other ones add features to the first voice...

What I wish to know is how to SAVE the "speaker" once I find the right list, to save time so it does not have to analyze it every time.
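
One possible approach, sketched under assumptions: compute the speaker data once and cache it on disk, keyed by the ordered list of reference files. The helper names below (cache_key, load_or_compute) and the compute callable are hypothetical and not part of the TTS library. As far as I know, the XTTS model class exposes get_conditioning_latents(audio_path=[...]) for computing the speaker latents, but treat that as an assumption and check it against your installed version.

```python
import hashlib
import pickle
from pathlib import Path
from typing import Any, Callable, List, Sequence


def cache_key(wav_paths: Sequence[str]) -> str:
    """Hash the ordered path list; it is not sorted because order matters."""
    joined = "\n".join(wav_paths)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:16]


def load_or_compute(
    wav_paths: Sequence[str],
    compute: Callable[[List[str]], Any],
    cache_dir: str = "speaker_cache",
) -> Any:
    """Return cached speaker data for wav_paths, computing it on first use.

    `compute` is a placeholder for whatever turns the reference wavs into
    reusable speaker data (for example, a wrapper around the model's
    latent-extraction call).
    """
    cache_file = Path(cache_dir) / f"{cache_key(wav_paths)}.pkl"
    if cache_file.exists():
        # Cache hit: skip the expensive analysis step.
        with cache_file.open("rb") as f:
            return pickle.load(f)
    result = compute(list(wav_paths))
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    with cache_file.open("wb") as f:
        pickle.dump(result, f)
    return result
```

The key depends only on the path strings, so the cache will not notice if a wav file is replaced under the same name. If the computed data contains PyTorch tensors, torch.save and torch.load may be a better fit than pickle.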

Nov 30 '23 20:11 Zibri

I was able to successfully generate audio using lists of strings of length 1 and 2.

from TTS.api import TTS


tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2", gpu=True)

tts.tts_to_file(
    text="Some test text",
    file_path="output.wav",
    speaker_wav=["test_wavs/1.wav", "test_wavs/2.wav"],
    language="en",
)



tts.tts_to_file(
    text="Some test text",
    file_path="output.wav",
    speaker_wav=["test_wavs/1.wav"],
    language="en",
)

Try upgrading to TTS version 0.21.2.

Nov 30 '23 21:11 loganhart02

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.

Jan 02 '24 09:01 stale[bot]
