[Bug] 'GPT2InferenceModel' object has no attribute 'generate'
Describe the bug
uv run .\teste1.py
WAV file: Sample rate=24000, Channels=1
tts_models/multilingual/multi-dataset/xtts_v2 is already downloaded.
Using model: xtts
GPT2InferenceModel has generative capabilities, as `prepare_inputs_for_generation` is explicitly defined. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions.
- If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes
- If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception).
- If you are not the owner of the model architecture class, please contact the model code owner to update it.
Text splitted to sentences.
['esse é um teste de clonagem de voz']
Traceback (most recent call last):
  File "D:\bkp_hd\Projetos\python\TTS_02\teste1.py", line 22, in <module>
    tts.tts_to_file(text=text, speaker_wav=wav_path, language="pt", file_path=output_path)
  File "D:\bkp_hd\Projetos\python\TTS_02\.venv\Lib\site-packages\TTS\api.py", line 334, in tts_to_file
    wav = self.tts(
  File "D:\bkp_hd\Projetos\python\TTS_02\.venv\Lib\site-packages\TTS\api.py", line 276, in tts
    wav = self.synthesizer.tts(
  File "D:\bkp_hd\Projetos\python\TTS_02\.venv\Lib\site-packages\TTS\utils\synthesizer.py", line 386, in tts
    outputs = self.tts_model.synthesize(
  File "D:\bkp_hd\Projetos\python\TTS_02\.venv\Lib\site-packages\TTS\tts\models\xtts.py", line 419, in synthesize
    return self.full_inference(text, speaker_wav, language, **settings)
  File "D:\bkp_hd\Projetos\python\TTS_02\.venv\Lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "D:\bkp_hd\Projetos\python\TTS_02\.venv\Lib\site-packages\TTS\tts\models\xtts.py", line 488, in full_inference
    return self.inference(
  File "D:\bkp_hd\Projetos\python\TTS_02\.venv\Lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "D:\bkp_hd\Projetos\python\TTS_02\.venv\Lib\site-packages\TTS\tts\models\xtts.py", line 541, in inference
    gpt_codes = self.gpt.generate(
  File "D:\bkp_hd\Projetos\python\TTS_02\.venv\Lib\site-packages\TTS\tts\layers\xtts\gpt.py", line 590, in generate
    gen = self.gpt_inference.generate(
  File "D:\bkp_hd\Projetos\python\TTS_02\.venv\Lib\site-packages\torch\nn\modules\module.py", line 1940, in __getattr__
    raise AttributeError(
AttributeError: 'GPT2InferenceModel' object has no attribute 'generate'
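The failure mechanism the warning describes can be illustrated in plain Python (no transformers required): from transformers v4.50, `PreTrainedModel` no longer inherits `GenerationMixin`, so a subclass like `GPT2InferenceModel` that never adds the mixin itself loses `generate()`. The class names below mirror the real ones but are stand-ins, not the actual transformers classes:

```python
# Stand-in classes illustrating the MRO issue from the warning above.

class GenerationMixin:
    def generate(self):
        return "tokens"

class PreTrainedModel:
    # transformers < 4.50 effectively did: class PreTrainedModel(GenerationMixin)
    pass

class BrokenInferenceModel(PreTrainedModel):
    pass  # no generate() anymore -> AttributeError at call time

class FixedInferenceModel(PreTrainedModel, GenerationMixin):
    pass  # mixin listed after PreTrainedModel, as the warning suggests

print(hasattr(BrokenInferenceModel(), "generate"))  # False
print(hasattr(FixedInferenceModel(), "generate"))   # True
```

This is why the warning insists on inheriting from `GenerationMixin` *after* `PreTrainedModel`: the mixin supplies `generate()` without overriding any of the model's own methods.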
Update the model to support transformers 4.50+.
To Reproduce
from TTS.api import TTS
from TTS.tts.configs.xtts_config import XttsConfig
from TTS.tts.models.xtts import XttsAudioConfig, XttsArgs
from TTS.config.shared_configs import BaseDatasetConfig
import torch
import soundfile as sf

# Add the globals to the allowlist
torch.serialization.add_safe_globals([XttsConfig, XttsAudioConfig, BaseDatasetConfig, XttsArgs])

# Check the reference WAV file
wav_path = "D:/bkp_hd/Projetos/python/TTS_02/temp/referencia.wav"
data, sample_rate = sf.read(wav_path)
print(f"WAV file: Sample rate={sample_rate}, Channels={data.shape[1] if data.ndim > 1 else 1}")

# Initialize the TTS model
tts = TTS(model_name="tts_models/multilingual/multi-dataset/xtts_v2", progress_bar=True, gpu=False)

# Generate the audio
text = "esse é um teste de clonagem de voz"
output_path = "D:/bkp_hd/Projetos/python/TTS_02/temp/output.wav"
tts.tts_to_file(text=text, speaker_wav=wav_path, language="pt", file_path=output_path)

print(f"Audio generated successfully: {output_path}")
Expected behavior
No response
Logs
Environment
uv pip show TTS transformers soundfile
Name: soundfile
Version: 0.13.1
Location: D:\bkp_hd\Projetos\python\TTS_02\.venv\Lib\site-packages
Requires: cffi, numpy
Required-by: librosa, trainer, tts
---
Name: transformers
Version: 4.52.1
Location: D:\bkp_hd\Projetos\python\TTS_02\.venv\Lib\site-packages
Requires: filelock, huggingface-hub, numpy, packaging, pyyaml, regex, requests, safetensors, tokenizers, tqdm
Required-by: tts
---
Name: tts
Version: 0.22.0
Additional context
No response
This is fixed in our fork (available via pip install coqui-tts). This repo is not maintained anymore.
Thank you very much
Downgrade transformers; it works for me: pip install transformers==4.33.0
I will downgrade, thank you very much. It worked now...
I'd recommend using the fork (pip install coqui-tts) instead. It supports newer transformers versions and fixes many other bugs.
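As a stopgap before pinning or switching to the fork, a simple version check can fail fast with a clear message instead of the AttributeError deep inside xtts.py. A stdlib-only sketch (the (4, 50) cutoff comes from the deprecation warning quoted above; the function names are my own):

```python
# Hedged sketch: compare a transformers version string against the 4.50
# cutoff at which PreTrainedModel stops inheriting GenerationMixin.

def version_tuple(ver: str) -> tuple[int, int]:
    """Parse 'X.Y.z...' into an (X, Y) tuple for comparison."""
    major, minor = ver.split(".")[:2]
    return (int(major), int(minor))

def transformers_incompatible(ver: str, cutoff: tuple[int, int] = (4, 50)) -> bool:
    """True if this transformers version breaks TTS 0.22.0's generate() call."""
    return version_tuple(ver) >= cutoff

print(transformers_incompatible("4.52.1"))  # True  -> the version from this report
print(transformers_incompatible("4.33.0"))  # False -> the known-working downgrade
```

In a script, the installed version could be obtained via `importlib.metadata.version("transformers")` and checked before constructing the `TTS` object.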
Looking at the project's requirements.txt, the root cause of this problem is that TTS does not constrain the versions of its dependencies, so the latest versions get installed, while TTS itself is not compatible with them. The best fix is to pin the version of every dependency, which solves the problem once and for all.
I translated this to English because it helped me a lot to understand the issue (I used ChatGPT):
Translation to English: 'After reviewing the project's requirements.txt, the root cause of the issue is that TTS does not specify version constraints for its dependencies. As a result, the latest versions of the dependencies are installed, which are not compatible with TTS itself. The best solution is to explicitly specify the version of each dependency to resolve the problem once and for all.'
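In practice that means pinning at least the packages involved in this failure in a requirements.txt, e.g. (versions taken from the working setups reported in this thread; a complete environment needs more pins, as the full list below shows):

```
TTS==0.22.0
transformers==4.33.0
tokenizers==0.13.3
```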
This is the list of packages with the corresponding versions that helped me run TTS inference correctly:
absl-py==2.3.0 aiohappyeyeballs==2.6.1 aiohttp==3.12.13 aiosignal==1.3.2 annotated-types==0.7.0 anyascii==0.3.2 attrs==25.3.0 audioread==3.0.1 babel==2.17.0 bangla==0.0.5 blinker==1.9.0 blis==1.2.1 bnnumerizer==0.0.2 bnunicodenormalizer==0.1.7 catalogue==2.0.10 certifi==2025.6.15 cffi==1.17.1 charset-normalizer==3.4.2 click==8.2.1 cloudpathlib==0.21.1 colorama==0.4.6 confection==0.1.5 contourpy==1.3.2 coqpit==0.0.17 cycler==0.12.1 cymem==2.0.11 Cython==3.1.2 dateparser==1.1.8 decorator==5.2.1 docopt==0.6.2 einops==0.8.1 encodec==0.1.1 filelock==3.18.0 Flask==3.1.1 fonttools==4.58.4 frozenlist==1.7.0 fsspec==2025.5.1 g2pkk==0.1.2 grpcio==1.73.0 gruut==2.2.3 gruut-ipa==0.13.0 gruut_lang_de==2.0.1 gruut_lang_en==2.0.1 gruut_lang_es==2.0.1 gruut_lang_fr==2.0.2 hangul-romanize==0.1.0 huggingface-hub==0.33.0 idna==3.10 inflect==7.5.0 itsdangerous==2.2.0 jamo==0.4.1 jieba==0.42.1 Jinja2==3.1.6 joblib==1.5.1 jsonlines==1.2.0 kiwisolver==1.4.8 langcodes==3.5.0 language_data==1.3.0 lazy_loader==0.4 librosa==0.11.0 llvmlite==0.44.0 marisa-trie==1.2.1 Markdown==3.8.2 markdown-it-py==3.0.0 MarkupSafe==3.0.2 matplotlib==3.10.3 mdurl==0.1.2 more-itertools==10.7.0 mpmath==1.3.0 msgpack==1.1.1 multidict==6.5.0 murmurhash==1.0.13 networkx==2.8.8 nltk==3.9.1 num2words==0.5.14 numba==0.61.2 numpy==1.26.4 packaging==25.0 pandas==1.5.3 pillow==11.0.0 platformdirs==4.3.8 pooch==1.8.2 preshed==3.0.10 propcache==0.3.2 protobuf==6.31.1 psutil==7.0.0 pycparser==2.22 pydantic==2.11.7 pydantic_core==2.33.2 Pygments==2.19.1 pynndescent==0.5.13 pyparsing==3.2.3 pypinyin==0.54.0 pysbd==0.3.4 python-crfsuite==0.9.11 python-dateutil==2.9.0.post0 pytz==2025.2 PyYAML==6.0.2 regex==2024.11.6 requests==2.32.4 rich==14.0.0 safetensors==0.5.3 scikit-learn==1.7.0 scipy==1.15.3 shellingham==1.5.4 six==1.17.0 smart-open==7.1.0 soundfile==0.13.1 soxr==0.5.0.post1 spacy==3.8.7 spacy-legacy==3.0.12 spacy-loggers==1.0.5 srsly==2.5.1 SudachiDict-core==20250515 SudachiPy==0.6.10 sympy==1.13.1 tensorboard==2.19.0 
tensorboard-data-server==0.7.2 thinc==8.3.4 threadpoolctl==3.6.0 tokenizers==0.13.3 torch==2.5.1+cu121 torchaudio==2.5.1+cu121 torchvision==0.20.1+cu121 tqdm==4.67.1 trainer==0.0.36 transformers==4.33.0 TTS==0.22.0 typeguard==4.4.4 typer==0.16.0 typing-inspection==0.4.1 typing_extensions==4.14.0 tzdata==2025.2 tzlocal==5.3.1 umap-learn==0.5.7 Unidecode==1.4.0 urllib3==2.5.0 wasabi==1.1.3 weasel==0.4.1 Werkzeug==3.1.3 wrapt==1.17.2 yarl==1.20.1
@ibaoger @aspirant2018 As mentioned above, I'd recommend just installing the fork instead with pip install coqui-tts. It has been updated to work with recent versions of packages out of the box and includes many other bug fixes.
@eginhard yes, i have seen your comment and your recommendation. I will use the forked package sooner or later. Thank you again
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.