Error reported during fine-tuning inference
Describe the bug

Running inference with a fine-tuned XTTS v2 model fails with an AssertionError about an unsupported language:

tts --text "Text for TTS" \
    --model_path /home/TTS-dev/recipes/ljspeech/xtts_v2/run/training/GPT_XTTS_v2.0_LJSpeech_FT-February-14-2024_05+17AM-0000000 \
    --config_path /home/TTS-dev/recipes/ljspeech/xtts_v2/run/training/GPT_XTTS_v2.0_LJSpeech_FT-February-14-2024_05+17AM-0000000/config.json \
    --out_path output02.wav
Using model: xtts
Text: Text for TTS
Text splitted to sentences.
['Text for TTS']
Traceback (most recent call last):
  File "/opt/conda/bin/tts", line 8, in <module>
    sys.exit(main())
  File "/home/TTS-dev/TTS/bin/synthesize.py", line 468, in main
    wav = synthesizer.tts(
  File "/home/TTS-dev/TTS/utils/synthesizer.py", line 386, in tts
    outputs = self.tts_model.synthesize(
  File "/home/TTS-dev/TTS/tts/models/xtts.py", line 399, in synthesize
    "zh-cn" if language == "zh" else language in self.config.languages
AssertionError:  ❗ Language None is not supported. Supported languages are ['en', 'es', 'fr', 'de', 'it', 'pt', 'pl', 'tr', 'ru', 'nl', 'cs', 'ar', 'zh-cn', 'hu', 'ko', 'ja', 'hi']
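For context, the assertion firing here is the XTTS language guard. A minimal sketch of the equivalent logic (the real check lives in TTS/tts/models/xtts.py; the language list below is copied from the error message):

```python
# Supported language codes, as listed in the AssertionError above.
SUPPORTED_LANGUAGES = [
    "en", "es", "fr", "de", "it", "pt", "pl", "tr", "ru",
    "nl", "cs", "ar", "zh-cn", "hu", "ko", "ja", "hi",
]

def check_language(language):
    """Sketch of the XTTS language guard: map 'zh' to 'zh-cn', then assert membership."""
    language = "zh-cn" if language == "zh" else language
    assert language in SUPPORTED_LANGUAGES, f"Language {language} is not supported."
    return language
```

Because `--language_idx` was omitted from the command, `language` arrives here as `None` and the assertion fails before synthesis starts.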
To Reproduce

tts --text "Text for TTS" \
    --model_path /home/TTS-dev/recipes/ljspeech/xtts_v2/run/training/GPT_XTTS_v2.0_LJSpeech_FT-February-14-2024_05+17AM-0000000 \
    --config_path /home/TTS-dev/recipes/ljspeech/xtts_v2/run/training/GPT_XTTS_v2.0_LJSpeech_FT-February-14-2024_05+17AM-0000000/config.json \
    --out_path output02.wav
Expected behavior
No response
Logs
No response
Environment
docker
Additional context
No response
Adding --language_idx en gets past the language check, but inference then fails with a TypeError because no --speaker_wav was given:

tts --text "Text for TTS" \
    --model_path /home/TTS-dev/recipes/ljspeech/xtts_v2/run/training/GPT_XTTS_v2.0_LJSpeech_FT-February-14-2024_05+17AM-0000000 \
    --config_path /home/TTS-dev/recipes/ljspeech/xtts_v2/run/training/GPT_XTTS_v2.0_LJSpeech_FT-February-14-2024_05+17AM-0000000/config.json \
    --language_idx en \
    --out_path output02.wav

Using model: xtts
Text: Text for TTS
Text splitted to sentences.
['Text for TTS']
Traceback (most recent call last):
  File "/opt/conda/bin/tts", line 8, in <module>
    sys.exit(main())
  File "/home/TTS-dev/TTS/bin/synthesize.py", line 468, in main
    wav = synthesizer.tts(
  File "/home/TTS-dev/TTS/utils/synthesizer.py", line 386, in tts
    outputs = self.tts_model.synthesize(
  File "/home/TTS-dev/TTS/tts/models/xtts.py", line 419, in synthesize
    return self.full_inference(text, speaker_wav, language, **settings)
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/TTS-dev/TTS/tts/models/xtts.py", line 480, in full_inference
    (gpt_cond_latent, speaker_embedding) = self.get_conditioning_latents(
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/TTS-dev/TTS/tts/models/xtts.py", line 357, in get_conditioning_latents
    audio = load_audio(file_path, load_sr)
  File "/home/TTS-dev/TTS/tts/models/xtts.py", line 73, in load_audio
    audio, lsr = torchaudio.load(audiopath)
  File "/opt/conda/lib/python3.10/site-packages/torchaudio/_backend/utils.py", line 204, in load
    return backend.load(uri, frame_offset, num_frames, normalize, channels_first, format, buffer_size)
  File "/opt/conda/lib/python3.10/site-packages/torchaudio/_backend/ffmpeg.py", line 336, in load
    return load_audio(os.path.normpath(uri), frame_offset, num_frames, normalize, channels_first, format)
  File "/opt/conda/lib/python3.10/posixpath.py", line 340, in normpath
    path = os.fspath(path)
TypeError: expected str, bytes or os.PathLike object, not NoneType
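The bottom frame shows why a missing --speaker_wav surfaces as a bare TypeError rather than a clearer message: torchaudio's ffmpeg backend normalizes the path with os.path.normpath() before opening it, and os.fspath(None) raises. A minimal stdlib reproduction (load_reference is a stand-in for the path handling, not the actual TTS code):

```python
import os

def load_reference(speaker_wav):
    # torchaudio's ffmpeg backend calls os.path.normpath() on the reference path;
    # with speaker_wav=None, os.fspath(None) inside normpath raises TypeError.
    return os.path.normpath(speaker_wav)
```

So the error is triggered by the CLI passing speaker_wav=None through to the audio loader when no reference file is supplied.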
At least for me, specifying a reference clip with --speaker_wav <filepath to reference> worked. I used the same reference audio that I used during training.
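Putting the two fixes together, the invocation that should work is sketched below. The reference path is a placeholder, since the commenter's actual file is not given; the other paths are copied from the report:

```shell
tts --text "Text for TTS" \
    --model_path /home/TTS-dev/recipes/ljspeech/xtts_v2/run/training/GPT_XTTS_v2.0_LJSpeech_FT-February-14-2024_05+17AM-0000000 \
    --config_path /home/TTS-dev/recipes/ljspeech/xtts_v2/run/training/GPT_XTTS_v2.0_LJSpeech_FT-February-14-2024_05+17AM-0000000/config.json \
    --language_idx en \
    --speaker_wav /path/to/training_reference.wav \
    --out_path output02.wav
```

XTTS is a voice-cloning model, so it always needs a reference clip at inference time; neither the fine-tuned checkpoint nor the config supplies one implicitly.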