[Bug] tts-server: TypeError: expected str, bytes or os.PathLike object, not NoneType
Describe the bug
Hello. When I try to use tts-server with xtts I get an error; tts-server works fine with vits and tacotron.
To Reproduce
$ tts-server --model_name tts_models/multilingual/multi-dataset/xtts_v1.1
Expected behavior
Server starts without error
Logs
$ tts-server --model_name tts_models/multilingual/multi-dataset/xtts_v1.1
> tts_models/multilingual/multi-dataset/xtts_v1.1 is already downloaded.
Traceback (most recent call last):
File "/opt/homebrew/bin/tts-server", line 5, in <module>
from TTS.server.server import main
File "/opt/homebrew/lib/python3.11/site-packages/TTS/server/server.py", line 104, in <module>
synthesizer = Synthesizer(
^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/TTS/utils/synthesizer.py", line 93, in __init__
self._load_tts(tts_checkpoint, tts_config_path, use_cuda)
File "/opt/homebrew/lib/python3.11/site-packages/TTS/utils/synthesizer.py", line 183, in _load_tts
self.tts_config = load_config(tts_config_path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/TTS/config/__init__.py", line 85, in load_config
ext = os.path.splitext(config_path)[1]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen posixpath>", line 118, in splitext
Environment
{
"CUDA": {
"GPU": [],
"available": false,
"version": null
},
"Packages": {
"PyTorch_debug": false,
"PyTorch_version": "2.2.0.dev20231018",
"TTS": "0.19.1",
"numpy": "1.24.3"
},
"System": {
"OS": "Darwin",
"architecture": [
"64bit",
""
],
"processor": "arm",
"python": "3.11.6",
"version": "Darwin Kernel Version 22.6.0: Fri Sep 15 13:41:30 PDT 2023; root:xnu-8796.141.3.700.8~1/RELEASE_ARM64_T8103"
}
}
Additional context
No response
I've got the same error when starting with xtts v2, running on macOS with an M2 chip.
tts-server --model_name tts_models/multilingual/multi-dataset/xtts_v2
> tts_models/multilingual/multi-dataset/xtts_v2 is already downloaded.
Traceback (most recent call last):
File "/Users/david/mambaforge/bin/tts-server", line 5, in <module>
from TTS.server.server import main
File "/Users/david/mambaforge/lib/python3.10/site-packages/TTS/server/server.py", line 104, in <module>
synthesizer = Synthesizer(
File "/Users/david/mambaforge/lib/python3.10/site-packages/TTS/utils/synthesizer.py", line 93, in __init__
self._load_tts(tts_checkpoint, tts_config_path, use_cuda)
File "/Users/david/mambaforge/lib/python3.10/site-packages/TTS/utils/synthesizer.py", line 183, in _load_tts
self.tts_config = load_config(tts_config_path)
File "/Users/david/mambaforge/lib/python3.10/site-packages/TTS/config/__init__.py", line 85, in load_config
ext = os.path.splitext(config_path)[1]
File "/Users/david/mambaforge/lib/python3.10/posixpath.py", line 118, in splitext
p = os.fspath(p)
TypeError: expected str, bytes or os.PathLike object, not NoneType
same issue here:
(youtube-machine) ian-coding@Ians-MacBook-Pro youtube-machine % tts-server --model_name tts_models/multilingual/multi-dataset/xtts_v2
> You must agree to the terms of service to use this model.
| > Please see the terms of service at https://coqui.ai/cpml.txt
| > "I have read, understood and agreed to the Terms and Conditions." - [y/n]
| | > y
> Downloading model to /Users/ian-coding/Library/Application Support/tts/tts_models--multilingual--multi-dataset--xtts_v2
> Model's license - CPML
> Check https://coqui.ai/cpml.txt for more info.
Traceback (most recent call last):
File "/Users/ian-coding/anaconda3/envs/youtube-machine/bin/tts-server", line 5, in <module>
from TTS.server.server import main
File "/Users/ian-coding/anaconda3/envs/youtube-machine/lib/python3.10/site-packages/TTS/server/server.py", line 104, in <module>
synthesizer = Synthesizer(
File "/Users/ian-coding/anaconda3/envs/youtube-machine/lib/python3.10/site-packages/TTS/utils/synthesizer.py", line 93, in __init__
self._load_tts(tts_checkpoint, tts_config_path, use_cuda)
File "/Users/ian-coding/anaconda3/envs/youtube-machine/lib/python3.10/site-packages/TTS/utils/synthesizer.py", line 183, in _load_tts
self.tts_config = load_config(tts_config_path)
File "/Users/ian-coding/anaconda3/envs/youtube-machine/lib/python3.10/site-packages/TTS/config/__init__.py", line 85, in load_config
ext = os.path.splitext(config_path)[1]
File "/Users/ian-coding/anaconda3/envs/youtube-machine/lib/python3.10/posixpath.py", line 118, in splitext
p = os.fspath(p)
TypeError: expected str, bytes or os.PathLike object, not NoneType
FWIW, when I look at the folder I do see a config file, yet I still get the error.
Server does not support XTTS currently :(
I've managed to get tts-server working with the xtts_v2 model, and also to use a speaker.wav so you can clone a voice.
Command to point the server at your xtts_v2 model:
python server.py --use_cuda true --model_path C:\Users\bob\AppData\Local\tts\tts_models--multilingual--multi-dataset--xtts_v2 --config_path C:\Users\bob\AppData\Local\tts\tts_models--multilingual--multi-dataset--xtts_v2\config.json
Note that, weirdly, --model_path does not include the actual model file, while --config_path does include the config file. I think this is the error this bug refers to: the config file needs to be referenced directly.
But there were a few more changes I had to make to get things working.
- Fix "AssertionError: Language is not supported." and "TypeError: Invalid file: None". XTTS is multilingual, so it needs a language, which the UI does not set, so I hardcoded it to "en" here. XTTS also requires a speaker.wav, which is likewise not set, so I hardcoded that too. In the future I plan to send both along with the request (see the sketch after the server.py snippet below).
server.py [line 203-ish]:
def tts():
    with lock:
        text = request.headers.get("text") or request.values.get("text", "")
        speaker_idx = request.headers.get("speaker-id") or request.values.get("speaker_id", "")
        language_idx = request.headers.get("language-id") or request.values.get("language_id", "")
        style_wav = request.headers.get("style-wav") or request.values.get("style_wav", "")
        style_wav = style_wav_uri_to_dict(style_wav)
        print(f" > Model input: {text}")
        print(f" > Speaker Idx: {speaker_idx}")
        print(f" > Language Idx: {language_idx}")
        # Change this line:
        # wavs = synthesizer.tts(text, speaker_name=speaker_idx, language_name=language_idx, style_wav=style_wav)
        # to this:
        wavs = synthesizer.tts(text, language_name="en", style_wav=style_wav, speaker_wav="YOUR_SPEAKER_WAV.wav")
        out = io.BytesIO()
        synthesizer.save_wav(wavs, out)
        return send_file(out, mimetype="audio/wav")
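To avoid hardcoding, the same handler can fall back to defaults only when the request does not supply values. A minimal sketch, assuming the request parsing above is kept; DEFAULT_SPEAKER_WAV is a hypothetical placeholder, not an existing setting in server.py:
# Sketch: take the language and reference WAV from the request when provided,
# otherwise fall back to defaults. DEFAULT_SPEAKER_WAV is a hypothetical placeholder.
DEFAULT_SPEAKER_WAV = "YOUR_SPEAKER_WAV.wav"
speaker_wav = request.headers.get("speaker-wav") or request.values.get("speaker_wav", "") or DEFAULT_SPEAKER_WAV
wavs = synthesizer.tts(text, language_name=language_idx or "en", style_wav=style_wav, speaker_wav=speaker_wav)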
- Fix error "The following
model_kwargs
are not used by the model: ['speaker_id'] " It seems that xtts does not like speaker_id for some reason.
synthesizer.py [line 384-ish]:
if hasattr(self.tts_model, "synthesize"):
    outputs = self.tts_model.synthesize(
        text=sen,
        config=self.tts_config,
        # speaker_id=speaker_name,  # comment this line out
        voice_dirs=self.voice_dir,
        d_vector=speaker_embedding,
        speaker_wav=speaker_wav,
        language=language_name,
        **kwargs,
    )
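A less invasive variant of the same edit (a sketch, not upstream code) is to forward speaker_id only when the request actually provided one, so other multi-speaker models keep working:
# Sketch: build the kwargs conditionally instead of deleting speaker_id outright.
synth_kwargs = dict(
    text=sen,
    config=self.tts_config,
    voice_dirs=self.voice_dir,
    d_vector=speaker_embedding,
    speaker_wav=speaker_wav,
    language=language_name,
)
if speaker_name:  # only pass a speaker_id when one was set
    synth_kwargs["speaker_id"] = speaker_name
outputs = self.tts_model.synthesize(**synth_kwargs, **kwargs)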
That's hopefully everything.
@MooTheCow nice. Maybe send a PR?
Someone please fix this.
I think I'll send the PR. Here is a step-by-step solution:
Issue: config_path fails to load
The primary issue is a failure to load config_path. It stems from incompatible XTTS versions not being handled properly in manage.py. Refer to the relevant code section here:
https://github.com/coqui-ai/TTS/blob/5dcc16d1931538e5bce7cb20c1986df371ee8cd6/TTS/utils/manage.py#L416-L419
A workaround is to replace "xtts" with "xtts1", which lets the server start normally. However, this modification leads to the following error after a while:
# Error Traceback
Traceback (most recent call last):
...
NotADirectoryError: [Errno 20] Not a directory: '/home/kreker/.local/share/tts/tts_models--multilingual--multi-dataset--xtts_v2/model.pth/model.pth'
Process finished with exit code 1
Attempted Resolution for the model.pth NotADirectoryError
The underlying issue can be traced to the following code segment: https://github.com/coqui-ai/TTS/blob/5dcc16d1931538e5bce7cb20c1986df371ee8cd6/TTS/tts/models/xtts.py#L731-L735
This implementation does not conform to the BaseTrainerModel interface, as seen here:
https://github.com/coqui-ai/TTS/blob/5dcc16d1931538e5bce7cb20c1986df371ee8cd6/TTS/model.py#L46-L49
In the Xtts implementation, the order of the checkpoint_dir and checkpoint_path parameters is incorrect. The proposed solution is to swap these two parameters. Additionally, the following lines should be added at the beginning of the load_checkpoint method:
if checkpoint_dir is None and checkpoint_path:
checkpoint_dir = os.path.dirname(checkpoint_path)
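Put together, the guard would sit at the top of the method roughly like this (a sketch only; the parameter list beyond checkpoint_dir and checkpoint_path is an assumption, not the exact upstream signature):
import os

# Sketch: derive checkpoint_dir from checkpoint_path when only the file path is given.
# Parameter names other than checkpoint_dir / checkpoint_path are illustrative.
def load_checkpoint(self, config, checkpoint_dir=None, checkpoint_path=None, eval=True, **kwargs):
    if checkpoint_dir is None and checkpoint_path:
        checkpoint_dir = os.path.dirname(checkpoint_path)
    # ... existing checkpoint-loading logic continues here ...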
Internal server error
[2023-12-29 23:46:53,340] ERROR in app: Exception on /api/tts [GET]
Traceback (most recent call last):
File "/home/kreker/Python/Projects/TTS/lib/python3.10/site-packages/flask/app.py", line 1455, in wsgi_app
response = self.full_dispatch_request()
File "/home/kreker/Python/Projects/TTS/lib/python3.10/site-packages/flask/app.py", line 869, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/kreker/Python/Projects/TTS/lib/python3.10/site-packages/flask/app.py", line 867, in full_dispatch_request
rv = self.dispatch_request()
File "/home/kreker/Python/Projects/TTS/lib/python3.10/site-packages/flask/app.py", line 852, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
File "/home/kreker/Python/Fork/coqui/tts-server/TTS/TTS/server/server.py", line 203, in tts
wavs = synthesizer.tts(text, speaker_name=speaker_idx, language_name=language_idx, style_wav=style_wav)
File "/home/kreker/Python/Fork/coqui/tts-server/TTS/TTS/utils/synthesizer.py", line 322, in tts
raise ValueError(
ValueError: [!] Looks like you are using a multi-speaker model. You need to define either a `speaker_idx` or a `speaker_wav` to use a multi-speaker model.
::ffff:127.0.0.1 - - [29/Dec/2023 23:46:53] "GET /api/tts?text=test&speaker_id=&style_wav=&language_id= HTTP/1.1" 500 -
Still WIP. I should probably add language selection to the web UI.
Please note that my analysis might be subject to inaccuracies and further review is advised.
Workaround
Pass the actual filesystem paths of the model and its config instead of passing the model name.
Example
$ python3 TTS/server/server.py \
--model_path ~/.local/share/tts/tts_models--multilingual--multi-dataset--xtts_v2 \
--config_path ~/.local/share/tts/tts_models--multilingual--multi-dataset--xtts_v2/config.json
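Once the server is up, the /api/tts endpoint seen in the tracebacks above can be called directly. A sketch, assuming the default port 5002 used elsewhere in this thread; for XTTS a speaker must also be supplied (or one of the speaker_wav changes above applied), otherwise the multi-speaker ValueError shown earlier appears:
import requests

# Query the running tts-server; the parameters mirror the /api/tts query string
# visible in the logs above (text, speaker_id, style_wav, language_id).
resp = requests.get(
    "http://localhost:5002/api/tts",
    params={"text": "Hello world", "language_id": "en", "speaker_id": "", "style_wav": ""},
)
resp.raise_for_status()
with open("out.wav", "wb") as f:
    f.write(resp.content)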
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.
Hello fellow developers,
I wanted to share my experience with running the Coqui AI TTS model on Docker, hoping it might be helpful to someone else who encounters similar issues.
System Configuration: I'm using a MacBook Pro M3 Max with 32 GB of unified memory.
Here are the steps I took:
- Configure Docker Desktop Settings: Allow maximum resources for your Docker instance in Docker Desktop's settings and enable Rosetta for x86_64/amd64 emulation on Apple Silicon.
- Run Docker with the Linux/AMD64 Platform: Use the following command (a volume-mount variant that also persists the downloaded model is sketched after the GPU commands below):
docker run --rm -it -p 5002:5002 --platform linux/amd64 --entrypoint /bin/bash ghcr.io/coqui-ai/tts
- Download the Model Inside the Docker Container:
python3 TTS/server/server.py --model_name tts_models/multilingual/multi-dataset/xtts_v2
This step resulted in the familiar "expected str, bytes or os.PathLike object" error.
- Run the Model with Model Path and Config Path: Use the following command, as @axxapy suggested above:
python3 TTS/server/server.py \
--model_path ~/.local/share/tts/tts_models/multilingual/multi-dataset/xtts_v2 \
--config_path ~/.local/share/tts/tts_models/multilingual/multi-dataset/xtts_v2/config.json
For NVIDIA users (tested on a V100 GPU):
docker run --rm -it -p 5002:5002 --platform linux/amd64 --entrypoint /bin/bash ghcr.io/coqui-ai/tts
python3 TTS/server/server.py --model_name tts_models/multilingual/multi-dataset/xtts_v2 --use_cuda true
python3 TTS/server/server.py \
--model_path ~/.local/share/tts/tts_models/multilingual/multi-dataset/xtts_v2 \
--config_path ~/.local/share/tts/tts_models/multilingual/multi-dataset/xtts_v2/config.json --use_cuda true
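To keep the downloaded model across container restarts, the model cache can be mounted from the host. A sketch: the in-container path /root/.local/share/tts matches the cache location shown in the Bark report below, and ~/tts-models is an arbitrary host directory.
docker run --rm -it -p 5002:5002 --platform linux/amd64 -v ~/tts-models:/root/.local/share/tts --entrypoint /bin/bash ghcr.io/coqui-ai/tts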
I hope this helps! If you encounter any issues or have further questions, feel free to ask.
Hi Team,
I am also having this issue with various models. My report here uses Bark, and I have tried specifying the config path directly:
root@f1d0694c075b:/# find -name config.json
./root/.local/share/tts/tts_models--multilingual--multi-dataset--bark/config.json
./root/TTS/vc/modules/freevc/wavlm/config.json
root@f1d0694c075b:/# python3 /root/TTS/server/server.py --model_name tts_models/multilingual/multi-dataset/bark --use_cuda true --config_path ./root/.local/share/tts/tts_models--multilingual--multi-dataset--bark/config.json
> tts_models/multilingual/multi-dataset/bark is already downloaded.
Traceback (most recent call last):
File "/root/TTS/server/server.py", line 104, in <module>
synthesizer = Synthesizer(
File "/root/TTS/utils/synthesizer.py", line 93, in __init__
self._load_tts(tts_checkpoint, tts_config_path, use_cuda)
File "/root/TTS/utils/synthesizer.py", line 183, in _load_tts
self.tts_config = load_config(tts_config_path)
File "/root/TTS/config/__init__.py", line 82, in load_config
ext = os.path.splitext(config_path)[1]
File "/usr/lib/python3.10/posixpath.py", line 118, in splitext
p = os.fspath(p)
TypeError: expected str, bytes or os.PathLike object, not NoneType
Secondly, I tried the suggestion of specifying both the config and model paths:
root@f1d0694c075b:/# python3 /root/TTS/server/server.py --model_path ./root/.local/share/tts/tts_models--multilingual--multi-dataset--bark --use_cuda true --config_path ./root/.local/share/tts/tts_models--multilingual--multi-dataset--bark/config.json
> Using model: bark
/usr/local/lib/python3.10/dist-packages/torch/nn/utils/weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
WARNING:TTS.tts.layers.bark.load_model:found outdated text model, removing...
100%|██████████| 17.2k/17.2k [00:00<00:00, 30.3MiB/s]
Traceback (most recent call last):
File "/root/TTS/server/server.py", line 104, in <module>
synthesizer = Synthesizer(
File "/root/TTS/utils/synthesizer.py", line 93, in __init__
self._load_tts(tts_checkpoint, tts_config_path, use_cuda)
File "/root/TTS/utils/synthesizer.py", line 192, in _load_tts
self.tts_model.load_checkpoint(self.tts_config, tts_checkpoint, eval=True)
File "/root/TTS/tts/models/bark.py", line 281, in load_checkpoint
self.load_bark_models()
File "/root/TTS/tts/models/bark.py", line 50, in load_bark_models
self.semantic_model, self.config = load_model(
File "/root/TTS/tts/layers/bark/load_model.py", line 121, in load_model
checkpoint = torch.load(ckpt_path, map_location=device)
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 1028, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 1246, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '<'.
@cmurrayis The _pickle.UnpicklingError: invalid load key, '<' error is due to a bug in Bark. This is fixed in our fork, available via pip install coqui-tts.
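For reference, a quick way to try the fork (a sketch only; it assumes the fork keeps the same tts-server entry point and flags, which may differ between versions):
pip install coqui-tts
tts-server --model_name tts_models/multilingual/multi-dataset/bark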