Error reported during fine-tuning inference

Open fanghaiquan1 opened this issue 1 year ago • 3 comments

Describe the bug

tts --text "Text for TTS" --model_path /home/TTS-dev/recipes/ljspeech/xtts_v2/run/training/GPT_XTTS_v2.0_LJSpeech_FT-February-14-2024_05+17AM-0000000 --config_path /home/TTS-dev/recipes/ljspeech/xtts_v2/run/training/GPT_XTTS_v2.0_LJSpeech_FT-February-14-2024_05+17AM-0000000/config.json --out_path output02.wav

Using model: xtts
Text: Text for TTS
Text splitted to sentences.
['Text for TTS']
Traceback (most recent call last):
  File "/opt/conda/bin/tts", line 8, in <module>
    sys.exit(main())
  File "/home/TTS-dev/TTS/bin/synthesize.py", line 468, in main
    wav = synthesizer.tts(
  File "/home/TTS-dev/TTS/utils/synthesizer.py", line 386, in tts
    outputs = self.tts_model.synthesize(
  File "/home/TTS-dev/TTS/tts/models/xtts.py", line 399, in synthesize
    "zh-cn" if language == "zh" else language in self.config.languages#"zh-cn" if language == "zh" else language in self.config.languages
AssertionError: ❗ Language None is not supported. Supported languages are ['en', 'es', 'fr', 'de', 'it', 'pt', 'pl', 'tr', 'ru', 'nl', 'cs', 'ar', 'zh-cn', 'hu', 'ko', 'ja', 'hi']

To Reproduce

tts --text "Text for TTS" --model_path /home/TTS-dev/recipes/ljspeech/xtts_v2/run/training/GPT_XTTS_v2.0_LJSpeech_FT-February-14-2024_05+17AM-0000000 --config_path /home/TTS-dev/recipes/ljspeech/xtts_v2/run/training/GPT_XTTS_v2.0_LJSpeech_FT-February-14-2024_05+17AM-0000000/config.json --out_path output02.wav

Expected behavior

No response

Logs

No response

Environment

docker

Additional context

No response

fanghaiquan1 · Feb 15 '24 10:02

tts --text "Text for TTS" --model_path /home/TTS-dev/recipes/ljspeech/xtts_v2/run/training/GPT_XTTS_v2.0_LJSpeech_FT-February-14-2024_05+17AM-0000000 --config_path /home/TTS-dev/recipes/ljspeech/xtts_v2/run/training/GPT_XTTS_v2.0_LJSpeech_FT-February-14-2024_05+17AM-0000000/config.json --language_idx en --out_path output02.wav

Using model: xtts
Text: Text for TTS
Text splitted to sentences.
['Text for TTS']
Traceback (most recent call last):
  File "/opt/conda/bin/tts", line 8, in <module>
    sys.exit(main())
  File "/home/TTS-dev/TTS/bin/synthesize.py", line 468, in main
    wav = synthesizer.tts(
  File "/home/TTS-dev/TTS/utils/synthesizer.py", line 386, in tts
    outputs = self.tts_model.synthesize(
  File "/home/TTS-dev/TTS/tts/models/xtts.py", line 419, in synthesize
    return self.full_inference(text, speaker_wav, language, **settings)
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/TTS-dev/TTS/tts/models/xtts.py", line 480, in full_inference
    (gpt_cond_latent, speaker_embedding) = self.get_conditioning_latents(
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/TTS-dev/TTS/tts/models/xtts.py", line 357, in get_conditioning_latents
    audio = load_audio(file_path, load_sr)
  File "/home/TTS-dev/TTS/tts/models/xtts.py", line 73, in load_audio
    audio, lsr = torchaudio.load(audiopath)
  File "/opt/conda/lib/python3.10/site-packages/torchaudio/_backend/utils.py", line 204, in load
    return backend.load(uri, frame_offset, num_frames, normalize, channels_first, format, buffer_size)
  File "/opt/conda/lib/python3.10/site-packages/torchaudio/_backend/ffmpeg.py", line 336, in load
    return load_audio(os.path.normpath(uri), frame_offset, num_frames, normalize, channels_first, format)
  File "/opt/conda/lib/python3.10/posixpath.py", line 340, in normpath
    path = os.fspath(path)
TypeError: expected str, bytes or os.PathLike object, not NoneType

fanghaiquan1 · Feb 15 '24 10:02

At least for me, specifying a speaker_wav (i.e. --speaker_wav <filepath to reference>) worked. I used the same reference audio as I did during training.

sorgfresser · Feb 18 '24 12:02
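
Both tracebacks in this thread come from a value that reached the model as None: first the language, then the reference audio path. A minimal sketch of a pre-flight check is shown here, assuming the same flags used in the thread (--language_idx and --speaker_wav) and the same checkpoint directory layout as the original commands. The helper name build_tts_command is hypothetical and is not part of the TTS package; it only assembles the argument list and does not run anything.

```python
from pathlib import Path


def build_tts_command(text, model_dir, language, speaker_wav, out_path):
    """Assemble a `tts` CLI call for a fine-tuned XTTS run directory.

    Hypothetical helper: it checks the two inputs that were None in the
    tracebacks above, then returns the argument list without executing it.
    """
    if not language:
        # Corresponds to "AssertionError: Language None is not supported".
        raise ValueError("language is required, e.g. 'en'")
    if not speaker_wav:
        # Corresponds to the TypeError raised when the audio path is None.
        raise ValueError("speaker_wav is required (path to a reference clip)")
    if not Path(speaker_wav).is_file():
        raise FileNotFoundError(f"reference clip not found: {speaker_wav}")

    model_dir = Path(model_dir)
    return [
        "tts",
        "--text", text,
        # The thread passes the run directory itself as --model_path;
        # that choice is copied here as-is, not verified.
        "--model_path", str(model_dir),
        "--config_path", str(model_dir / "config.json"),
        "--language_idx", language,
        "--speaker_wav", str(speaker_wav),
        "--out_path", str(out_path),
    ]
```

The returned list can be handed to subprocess.run(cmd, check=True) once the paths are confirmed. Building a list rather than a single string avoids shell quoting problems with run directory names that contain characters such as "+".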

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.

stale[bot] · Apr 22 '24 05:04