
ImportError: cannot import name 'GenerationMixin' from 'transformers.generation'

Open qsuzer opened this issue 7 months ago • 2 comments

System Info

Package Version Editable project location


accelerate 1.7.0 aiohappyeyeballs 2.4.4 aiohttp 3.11.9 aiosignal 1.3.1 altair 5.5.0 annotated-types 0.7.0 anyio 4.6.2.post1 argon2-cffi 23.1.0 argon2-cffi-bindings 21.2.0 arrow 1.3.0 asttokens 3.0.0 async-lru 2.0.5 async-timeout 5.0.1 attrs 24.2.0 babel 2.17.0 base58 2.1.1 beautifulsoup4 4.13.3 bitsandbytes 0.45.5 bleach 6.2.0 blinker 1.9.0 blis 0.7.11 bm25s 0.2.0 cachetools 5.5.0 catalogue 2.0.10 certifi 2024.8.30 cffi 1.17.1 charset-normalizer 3.4.0 click 8.1.7 coloredlogs 15.0.1 comm 0.2.2 confection 0.1.5 contourpy 1.3.0 cycler 0.12.1 cymem 2.0.10 Cython 3.0.11 dashscope 1.22.2 datasets 3.1.0 debugpy 1.8.13 decorator 5.2.1 defusedxml 0.7.1 dill 0.3.8 distro 1.9.0 docker-pycreds 0.4.0 eval_type_backport 0.2.2 exceptiongroup 1.2.2 executing 2.2.0 faiss-gpu 1.7.2 fastapi 0.115.6 fastjsonschema 2.21.1 filelock 3.16.1 flashrag-dev 0.1.4.dev0 /home/wmz/FlashRAG flatbuffers 24.3.25 fonttools 4.56.0 fqdn 1.5.1 frozenlist 1.5.0 fschat 0.2.36 fsspec 2024.9.0 gitdb 4.0.11 GitPython 3.1.43 h11 0.14.0 hf-xet 1.1.2 httpcore 1.0.7 httpx 0.28.0 huggingface-hub 0.32.2 humanfriendly 10.0 idna 3.10 importlib_metadata 8.6.1 importlib_resources 6.5.2 ipykernel 6.29.5 ipython 8.18.1 ipywidgets 8.1.5 isoduration 20.11.0 jedi 0.19.2 Jinja2 3.1.4 jiter 0.8.0 joblib 1.4.2 json5 0.10.0 jsonlines 4.0.0 jsonpointer 3.0.0 jsonschema 4.23.0 jsonschema-specifications 2024.10.1 jupyter 1.1.1 jupyter_client 8.6.3 jupyter-console 6.6.3 jupyter_core 5.7.2 jupyter-events 0.12.0 jupyter-lsp 2.2.5 jupyter_server 2.15.0 jupyter_server_terminals 0.5.3 jupyterlab 4.3.6 jupyterlab_pygments 0.3.0 jupyterlab_server 2.27.3 jupyterlab_widgets 3.0.13 kiwisolver 1.4.7 langcodes 3.5.0 language_data 1.3.0 latex2mathml 3.77.0 lightgbm 4.5.0 llvmlite 0.43.0 marisa-trie 1.2.1 markdown-it-py 3.0.0 markdown2 2.5.1 MarkupSafe 3.0.2 matplotlib 3.9.4 matplotlib-inline 0.1.7 mdurl 0.1.2 mistune 3.1.3 modelscope 1.21.0 mpmath 1.3.0 multidict 6.1.0 multiprocess 0.70.16 murmurhash 1.0.11 narwhals 1.15.2 nbclient 0.10.2 
nbconvert 7.16.6 nbformat 5.10.4 nest-asyncio 1.6.0 networkx 3.2.1 nh3 0.2.19 nltk 3.9.1 nmslib 2.1.1 notebook 7.3.3 notebook_shim 0.2.4 numba 0.60.0 numpy 1.26.4 nvidia-cublas-cu12 12.1.3.1 nvidia-cuda-cupti-cu12 12.1.105 nvidia-cuda-nvrtc-cu12 12.1.105 nvidia-cuda-runtime-cu12 12.1.105 nvidia-cudnn-cu12 8.9.2.26 nvidia-cufft-cu12 11.0.2.54 nvidia-curand-cu12 10.3.2.106 nvidia-cusolver-cu12 11.4.5.107 nvidia-cusparse-cu12 12.1.0.106 nvidia-nccl-cu12 2.18.1 nvidia-nvjitlink-cu12 12.4.127 nvidia-nvtx-cu12 12.1.105 onnxruntime 1.19.2 openai 1.56.2 orjson 3.10.12 overrides 7.7.0 packaging 24.2 pandas 2.2.3 pandocfilters 1.5.1 parso 0.8.4 pathlib_abc 0.1.1 pathy 0.11.0 peft 0.13.2 pexpect 4.9.0 pillow 11.0.0 pip 24.3.1 platformdirs 4.3.7 preshed 3.0.9 prometheus_client 0.21.1 prompt_toolkit 3.0.48 propcache 0.2.1 protobuf 5.29.1 psutil 6.1.0 ptyprocess 0.7.0 pure_eval 0.2.3 pyarrow 18.1.0 pybind11 2.6.1 pycparser 2.22 pydantic 2.10.3 pydantic_core 2.27.1 pydeck 0.9.1 Pygments 2.18.0 pyjnius 1.6.1 pyparsing 3.2.1 pyserini 0.22.1 PyStemmer 2.2.0.3 python-dateutil 2.9.0.post0 python-json-logger 3.3.0 pytz 2024.2 PyYAML 6.0.2 pyzmq 26.3.0 qwen-agent 0.0.16 rank-bm25 0.2.2 referencing 0.35.1 regex 2024.11.6 requests 2.32.3 rfc3339-validator 0.1.4 rfc3986-validator 0.1.1 rich 13.9.4 rouge 1.0.1 rpds-py 0.22.3 safetensors 0.4.6.dev0 scikit-learn 1.6.0 scipy 1.10.1 seaborn 0.13.2 Send2Trash 1.8.3 sentence-transformers 3.3.1 sentencepiece 0.2.0 sentry-sdk 2.29.1 setproctitle 1.3.6 setuptools 75.6.0 shortuuid 1.0.13 six 1.17.0 smart-open 6.4.0 smmap 5.0.1 sniffio 1.3.1 soupsieve 2.6 spacy 3.6.1 spacy-legacy 3.0.12 spacy-loggers 1.0.5 srsly 2.4.8 stack-data 0.6.3 starlette 0.41.3 streamlit 1.40.2 svgwrite 1.4.3 sympy 1.13.1 tenacity 9.0.0 terminado 0.18.1 thinc 8.1.12 threadpoolctl 3.5.0 tiktoken 0.8.0 tinycss2 1.4.0 tokenizers 0.21.1 toml 0.10.2 tomli 2.2.1 torch 2.1.2 tornado 6.4.2 tqdm 4.67.1 traitlets 5.14.3 transformers 4.52.3 triton 2.1.0 trl 0.19.0 typer 0.9.4 
types-python-dateutil 2.9.0.20241206 typing_extensions 4.12.2 tzdata 2024.2 uri-template 1.3.0 urllib3 2.2.3 uvicorn 0.32.1 wandb 0.19.11 wasabi 1.1.3 watchdog 6.0.0 wavedrom 2.0.3.post3 wcwidth 0.2.13 webcolors 24.11.1 webencodings 0.5.1 websocket-client 1.8.0 wheel 0.45.1 widgetsnbextension 4.0.13 xxhash 3.5.0 yarl 1.18.3 zipp 3.21.0

Who can help?

No response

Information

  • [ ] The official example scripts
  • [ ] My own modified scripts

Tasks

  • [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [ ] My own task or dataset (give details below)

Reproduction

import json
import logging
import os

import torch
from datasets import Dataset
from peft import LoraConfig, TaskType, prepare_model_for_kbit_training
from tqdm import tqdm
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

    model, tokenizer = prepare_model_and_tokenizer(MODEL_NAME)

    # LoRA config
    peft_config = LoraConfig(
        r=16,
        lora_alpha=32,
        target_modules=["q_proj", "v_proj", "k_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
        lora_dropout=0.1,
        bias="none",
        task_type=TaskType.CAUSAL_LM,
    )

    # SFT config
    sft_config = SFTConfig(
        output_dir=OUTPUT_DIR,
        num_train_epochs=3,
        per_device_train_batch_size=4,
        per_device_eval_batch_size=4,
        gradient_accumulation_steps=4,
        optim="paged_adamw_8bit",
        save_steps=500,
        logging_steps=50,
        learning_rate=2e-4,
        weight_decay=0.001,
        fp16=True,
        bf16=False,
        max_grad_norm=0.3,
        warmup_ratio=0.03,
        lr_scheduler_type="cosine",
        eval_strategy="steps",
        eval_steps=500,
        save_total_limit=2,
        load_best_model_at_end=True,
        report_to="none",
        max_seq_length=512,
        packing=False,
        dataset_text_field="text",
    )

    # SFT trainer
    trainer = SFTTrainer(
        model=model,
        args=sft_config,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        processing_class=tokenizer,
        peft_config=peft_config,
        formatting_func=None,
    )

Expected behavior

Traceback (most recent call last):
  File "/home/wmz/FlashRAG/train_decomposer.py", line 5, in <module>
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments
  File "/data/anaconda3/envs/flashrag/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 2045, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/data/anaconda3/envs/flashrag/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 2075, in _get_module
    raise e
  File "/data/anaconda3/envs/flashrag/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 2073, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "/data/anaconda3/envs/flashrag/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/data/anaconda3/envs/flashrag/lib/python3.9/site-packages/transformers/models/auto/modeling_auto.py", line 21, in <module>
    from .auto_factory import (
  File "/data/anaconda3/envs/flashrag/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py", line 40, in <module>
    from ...generation import GenerationMixin
ImportError: cannot import name 'GenerationMixin' from 'transformers.generation' (/data/anaconda3/envs/flashrag/lib/python3.9/site-packages/transformers/generation/__init__.py)

I installed it through "pip install transformers", and the version is 4.52.3.

qsuzer avatar May 28 '25 14:05 qsuzer

Hi @qsuzer, this seems like an environment issue that we probably can't debug for you! Can you try making a fresh environment? GenerationMixin should be importable.

Rocketknight1 avatar May 28 '25 15:05 Rocketknight1

Same problem. Have you resolved it?

baojunqi avatar Jun 14 '25 10:06 baojunqi

Same issue here when running import transformers.models.bert.modeling_bert

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "miniforge3/envs/mapping/lib/python3.10/site-packages/transformers/models/bert/modeling_bert.py", line 31, in <module>
    from ...generation import GenerationMixin
ImportError: cannot import name 'GenerationMixin' from 'transformers.generation' (miniforge3/envs/mapping/lib/python3.10/site-packages/transformers/generation/__init__.py)

Related: #36010

cramsuig avatar Jun 18 '25 09:06 cramsuig

This is very weird. Can anyone give us a reproducible environment? For example, if we could run some code in Colab to see this error, it would be much easier to diagnose. Importing GenerationMixin works fine for me on my local machine.

Rocketknight1 avatar Jun 18 '25 13:06 Rocketknight1

Hi! I'm encountering the same error when using transformers inside a Kaggle notebook with a custom virtual environment, after I use the "Run and Save" option:

ImportError: cannot import name 'GenerationMixin' from 'transformers.generation' 
(/kaggle/working/./venv/lib/python3.11/site-packages/transformers/generation/__init__.py)

❗ Important Context

  • I’m not importing GenerationMixin directly.
  • This error occurs when I simply run:
from transformers import AutoTokenizer, AutoModelForCausalLM
  • My setup is inside a Kaggle notebook using a custom virtual environment:
!pip install virtualenv
!virtualenv venv
!./venv/bin/pip install transformers torch evaluate bert_score

And in Python:

import sys
sys.path.insert(0, './venv/lib/python3.11/site-packages')
from transformers import AutoTokenizer, AutoModelForCausalLM  # triggers the error

🔍 Diagnosis

It looks like the installed transformers package is trying to import GenerationMixin from:

transformers.generation

But GenerationMixin actually resides in:

transformers.generation.utils

That suggests:

  • Either the __init__.py of transformers.generation is outdated/misconfigured,
  • Or there's a packaging issue with the specific version being pulled into the venv on Kaggle.
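To narrow down which of the two it is, a quick self-contained check (the helper name is mine, not from transformers) that reports which copies of torch and transformers the running interpreter actually resolves — a missing or mismatched torch is one way this exact ImportError can appear:

```python
import importlib.util

def check_packages(names=("torch", "transformers")):
    # Map each package name to the file it resolves to, or None if the
    # current interpreter cannot find it at all.
    return {n: (spec.origin if (spec := importlib.util.find_spec(n)) else None)
            for n in names}

if __name__ == "__main__":
    for pkg, path in check_packages().items():
        print(f"{pkg}: {path or 'NOT FOUND'}")
```

Running this inside the Kaggle kernel (after the sys.path.insert) shows whether the venv's copies are really the ones being picked up.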

🔁 Reproducibility

This happens consistently on Kaggle in a Python 3.11 environment with a virtualenv created inside the notebook. The issue might not show up in system-wide installs, but happens when running the following code from the "Run and Save" option in Kaggle notebooks:

!virtualenv venv
!./venv/bin/pip install transformers

The error didn't happen when running the draft session, it only happened during the external session created after "Run and Save" option on the notebook. I’d be happy to provide a minimal notebook example if needed!

Thanks in advance 🙏

YoussefMaghrebi avatar Jul 04 '25 05:07 YoussefMaghrebi

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Jul 28 '25 08:07 github-actions[bot]

The issue is still present as of September 2025. I am using a fresh venv with Python 3.12 and the Qwen-provided snippet:

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-32B")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-32B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Same error as cited previously

Tfloow avatar Sep 29 '25 14:09 Tfloow

I can't figure this out - the import works perfectly for me from either generation or generation.utils. The only thing that could break this is a missing torch dependency, because those classes depend on Torch and won't be initialized if it isn't present.
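The gating described above can be sketched generically (a simplified illustration of the conditional-export pattern, not transformers' actual code): torch-backed names are only registered in the package's export table when torch is importable, so a broken torch install surfaces as exactly this ImportError rather than a torch error:

```python
def build_exports(torch_available: bool) -> dict:
    """Simplified sketch of a conditional export table: names that need
    torch are only registered when torch imports cleanly."""
    exports = {"configuration_utils": ["GenerationConfig"]}
    if torch_available:
        # GenerationMixin lives in generation/utils.py and requires torch.
        exports["utils"] = ["GenerationMixin"]
    return exports

# With torch broken, GenerationMixin is simply never exported, so
# `from transformers.generation import GenerationMixin` raises ImportError.
```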

Rocketknight1 avatar Sep 30 '25 14:09 Rocketknight1

Faced this issue, but it got fixed after reloading the Jupyter notebook. I think it has something to do with torch dependencies, at least in my case. Code:

import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
    Trainer,
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from datasets import Dataset
import json

  1. Initially, importing the torch module failed. I installed torch with "pip install torch transformers peft ......".
  2. Reinstalled torch using the command from https://pytorch.org/get-started/locally/ for my CUDA Windows setup: "pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu126".
  3. Reran this block; torch now imported, but the current issue appeared on the second line: "ImportError: cannot import name 'GenerationMixin' from 'transformers.generation'".
  4. Uninstalled the transformers package and reinstalled it, with no luck.
  5. But after restarting the entire notebook, it worked fine. It's a pretty new machine; these packages were installed for the first time.
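The torch-first theory above is easy to test directly; a small hypothetical helper that reports whether a module imports cleanly (and, if not, what the real underlying error was — useful because transformers can mask a torch failure as this ImportError):

```python
import importlib

def import_error_of(module_name):
    # Return None if the module imports cleanly, otherwise the repr of
    # the exception it raised.
    try:
        importlib.import_module(module_name)
        return None
    except Exception as exc:
        return repr(exc)

# Check torch before transformers: if torch itself fails, fix that first.
for mod in ("torch", "transformers"):
    print(mod, "->", import_error_of(mod) or "imports cleanly")
```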

harish-raj-t avatar Oct 14 '25 07:10 harish-raj-t

I solved it. My Python is 3.12 with transformers==4.51.0; you need to update setuptools, as follows: pip install --upgrade setuptools. You can try it.

tzhang2014 avatar Oct 14 '25 08:10 tzhang2014