ImportError: cannot import name 'GenerationMixin' from 'transformers.generation'
System Info
Package Version Editable project location
accelerate 1.7.0 aiohappyeyeballs 2.4.4 aiohttp 3.11.9 aiosignal 1.3.1 altair 5.5.0 annotated-types 0.7.0 anyio 4.6.2.post1 argon2-cffi 23.1.0 argon2-cffi-bindings 21.2.0 arrow 1.3.0 asttokens 3.0.0 async-lru 2.0.5 async-timeout 5.0.1 attrs 24.2.0 babel 2.17.0 base58 2.1.1 beautifulsoup4 4.13.3 bitsandbytes 0.45.5 bleach 6.2.0 blinker 1.9.0 blis 0.7.11 bm25s 0.2.0 cachetools 5.5.0 catalogue 2.0.10 certifi 2024.8.30 cffi 1.17.1 charset-normalizer 3.4.0 click 8.1.7 coloredlogs 15.0.1 comm 0.2.2 confection 0.1.5 contourpy 1.3.0 cycler 0.12.1 cymem 2.0.10 Cython 3.0.11 dashscope 1.22.2 datasets 3.1.0 debugpy 1.8.13 decorator 5.2.1 defusedxml 0.7.1 dill 0.3.8 distro 1.9.0 docker-pycreds 0.4.0 eval_type_backport 0.2.2 exceptiongroup 1.2.2 executing 2.2.0 faiss-gpu 1.7.2 fastapi 0.115.6 fastjsonschema 2.21.1 filelock 3.16.1 flashrag-dev 0.1.4.dev0 /home/wmz/FlashRAG flatbuffers 24.3.25 fonttools 4.56.0 fqdn 1.5.1 frozenlist 1.5.0 fschat 0.2.36 fsspec 2024.9.0 gitdb 4.0.11 GitPython 3.1.43 h11 0.14.0 hf-xet 1.1.2 httpcore 1.0.7 httpx 0.28.0 huggingface-hub 0.32.2 humanfriendly 10.0 idna 3.10 importlib_metadata 8.6.1 importlib_resources 6.5.2 ipykernel 6.29.5 ipython 8.18.1 ipywidgets 8.1.5 isoduration 20.11.0 jedi 0.19.2 Jinja2 3.1.4 jiter 0.8.0 joblib 1.4.2 json5 0.10.0 jsonlines 4.0.0 jsonpointer 3.0.0 jsonschema 4.23.0 jsonschema-specifications 2024.10.1 jupyter 1.1.1 jupyter_client 8.6.3 jupyter-console 6.6.3 jupyter_core 5.7.2 jupyter-events 0.12.0 jupyter-lsp 2.2.5 jupyter_server 2.15.0 jupyter_server_terminals 0.5.3 jupyterlab 4.3.6 jupyterlab_pygments 0.3.0 jupyterlab_server 2.27.3 jupyterlab_widgets 3.0.13 kiwisolver 1.4.7 langcodes 3.5.0 language_data 1.3.0 latex2mathml 3.77.0 lightgbm 4.5.0 llvmlite 0.43.0 marisa-trie 1.2.1 markdown-it-py 3.0.0 markdown2 2.5.1 MarkupSafe 3.0.2 matplotlib 3.9.4 matplotlib-inline 0.1.7 mdurl 0.1.2 mistune 3.1.3 modelscope 1.21.0 mpmath 1.3.0 multidict 6.1.0 multiprocess 0.70.16 murmurhash 1.0.11 narwhals 1.15.2 nbclient 0.10.2 
nbconvert 7.16.6 nbformat 5.10.4 nest-asyncio 1.6.0 networkx 3.2.1 nh3 0.2.19 nltk 3.9.1 nmslib 2.1.1 notebook 7.3.3 notebook_shim 0.2.4 numba 0.60.0 numpy 1.26.4 nvidia-cublas-cu12 12.1.3.1 nvidia-cuda-cupti-cu12 12.1.105 nvidia-cuda-nvrtc-cu12 12.1.105 nvidia-cuda-runtime-cu12 12.1.105 nvidia-cudnn-cu12 8.9.2.26 nvidia-cufft-cu12 11.0.2.54 nvidia-curand-cu12 10.3.2.106 nvidia-cusolver-cu12 11.4.5.107 nvidia-cusparse-cu12 12.1.0.106 nvidia-nccl-cu12 2.18.1 nvidia-nvjitlink-cu12 12.4.127 nvidia-nvtx-cu12 12.1.105 onnxruntime 1.19.2 openai 1.56.2 orjson 3.10.12 overrides 7.7.0 packaging 24.2 pandas 2.2.3 pandocfilters 1.5.1 parso 0.8.4 pathlib_abc 0.1.1 pathy 0.11.0 peft 0.13.2 pexpect 4.9.0 pillow 11.0.0 pip 24.3.1 platformdirs 4.3.7 preshed 3.0.9 prometheus_client 0.21.1 prompt_toolkit 3.0.48 propcache 0.2.1 protobuf 5.29.1 psutil 6.1.0 ptyprocess 0.7.0 pure_eval 0.2.3 pyarrow 18.1.0 pybind11 2.6.1 pycparser 2.22 pydantic 2.10.3 pydantic_core 2.27.1 pydeck 0.9.1 Pygments 2.18.0 pyjnius 1.6.1 pyparsing 3.2.1 pyserini 0.22.1 PyStemmer 2.2.0.3 python-dateutil 2.9.0.post0 python-json-logger 3.3.0 pytz 2024.2 PyYAML 6.0.2 pyzmq 26.3.0 qwen-agent 0.0.16 rank-bm25 0.2.2 referencing 0.35.1 regex 2024.11.6 requests 2.32.3 rfc3339-validator 0.1.4 rfc3986-validator 0.1.1 rich 13.9.4 rouge 1.0.1 rpds-py 0.22.3 safetensors 0.4.6.dev0 scikit-learn 1.6.0 scipy 1.10.1 seaborn 0.13.2 Send2Trash 1.8.3 sentence-transformers 3.3.1 sentencepiece 0.2.0 sentry-sdk 2.29.1 setproctitle 1.3.6 setuptools 75.6.0 shortuuid 1.0.13 six 1.17.0 smart-open 6.4.0 smmap 5.0.1 sniffio 1.3.1 soupsieve 2.6 spacy 3.6.1 spacy-legacy 3.0.12 spacy-loggers 1.0.5 srsly 2.4.8 stack-data 0.6.3 starlette 0.41.3 streamlit 1.40.2 svgwrite 1.4.3 sympy 1.13.1 tenacity 9.0.0 terminado 0.18.1 thinc 8.1.12 threadpoolctl 3.5.0 tiktoken 0.8.0 tinycss2 1.4.0 tokenizers 0.21.1 toml 0.10.2 tomli 2.2.1 torch 2.1.2 tornado 6.4.2 tqdm 4.67.1 traitlets 5.14.3 transformers 4.52.3 triton 2.1.0 trl 0.19.0 typer 0.9.4 
types-python-dateutil 2.9.0.20241206 typing_extensions 4.12.2 tzdata 2024.2 uri-template 1.3.0 urllib3 2.2.3 uvicorn 0.32.1 wandb 0.19.11 wasabi 1.1.3 watchdog 6.0.0 wavedrom 2.0.3.post3 wcwidth 0.2.13 webcolors 24.11.1 webencodings 0.5.1 websocket-client 1.8.0 wheel 0.45.1 widgetsnbextension 4.0.13 xxhash 3.5.0 yarl 1.18.3 zipp 3.21.0
Who can help?
No response
Information
- [ ] The official example scripts
- [ ] My own modified scripts
Tasks
- [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
import json
import logging
import os

import torch
from datasets import Dataset
from peft import LoraConfig, TaskType, prepare_model_for_kbit_training
from tqdm import tqdm
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer
model, tokenizer = prepare_model_and_tokenizer(MODEL_NAME)
model, tokenizer = prepare_model_and_tokenizer(MODEL_NAME)
# LoRA configuration
peft_config = LoraConfig(
r=16,
lora_alpha=32,
target_modules=["q_proj", "v_proj", "k_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
lora_dropout=0.1,
bias="none",
task_type=TaskType.CAUSAL_LM,
)
# SFT configuration
sft_config = SFTConfig(
output_dir=OUTPUT_DIR,
num_train_epochs=3,
per_device_train_batch_size=4,
per_device_eval_batch_size=4,
gradient_accumulation_steps=4,
optim="paged_adamw_8bit",
save_steps=500,
logging_steps=50,
learning_rate=2e-4,
weight_decay=0.001,
fp16=True,
bf16=False,
max_grad_norm=0.3,
warmup_ratio=0.03,
lr_scheduler_type="cosine",
eval_strategy="steps",
eval_steps=500,
save_total_limit=2,
load_best_model_at_end=True,
report_to="none",
max_seq_length=512,
packing=False,
dataset_text_field="text",
)
# SFT trainer
trainer = SFTTrainer(
model=model,
args=sft_config,
train_dataset=train_dataset,
eval_dataset=eval_dataset,
processing_class=tokenizer,
peft_config=peft_config,
formatting_func=None,
)
Expected behavior
Traceback (most recent call last):
File "/home/wmz/FlashRAG/train_decomposer.py", line 5, in
I installed it with "pip install transformers", and the version is 4.52.3.
Hi @qsuzer, this seems like an environment issue that we probably can't debug for you! Can you try making a fresh environment? GenerationMixin should be importable.
Same problem here. Have you resolved it?
Same issue here when running import transformers.models.bert.modeling_bert
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "miniforge3/envs/mapping/lib/python3.10/site-packages/transformers/models/bert/modeling_bert.py", line 31, in <module>
from ...generation import GenerationMixin
ImportError: cannot import name 'GenerationMixin' from 'transformers.generation' (miniforge3/envs/mapping/lib/python3.10/site-packages/transformers/generation/__init__.py)
Related: #36010
This is very weird. Can anyone give us a reproducible environment? For example, if we could run some code in Colab to see this error, it would be much easier to diagnose. Importing GenerationMixin works fine for me on my local machine.
Hi! I'm encountering the same error when using transformers inside a Kaggle notebook with a custom virtual environment, after I use the option "Run and Save":
ImportError: cannot import name 'GenerationMixin' from 'transformers.generation'
(/kaggle/working/./venv/lib/python3.11/site-packages/transformers/generation/__init__.py)
❗ Important Context
- I'm not importing GenerationMixin directly.
- This error occurs when I simply run:
from transformers import AutoTokenizer, AutoModelForCausalLM
- My setup is inside a Kaggle notebook using a custom virtual environment:
!pip install virtualenv
!virtualenv venv
!./venv/bin/pip install transformers torch evaluate bert_score
And in Python:
import sys
sys.path.insert(0, './venv/lib/python3.11/site-packages')
from transformers import AutoTokenizer, AutoModelForCausalLM # triggers the error
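For context, inserting a directory at the front of sys.path makes any package there shadow copies elsewhere on the path, so a mismatched or stale copy can win over the one you expect. A minimal stdlib-only sketch of this mechanism (the module name demo_mod is made up for illustration):

```python
import os
import sys
import tempfile

# Two directories each provide a module with the same name; whichever
# directory comes first on sys.path is the one Python imports.
with tempfile.TemporaryDirectory() as dir_a, tempfile.TemporaryDirectory() as dir_b:
    for d, version in ((dir_a, "old"), (dir_b, "new")):
        with open(os.path.join(d, "demo_mod.py"), "w") as f:
            f.write(f"VERSION = {version!r}\n")
    sys.path.insert(0, dir_a)
    sys.path.insert(0, dir_b)  # dir_b now shadows dir_a
    import demo_mod
    print(demo_mod.VERSION)  # prints 'new'
```

In the Kaggle setup above, the sys.path.insert puts the venv's site-packages ahead of the kernel's preinstalled packages, so any mixing of the two installs could produce an import from an unexpected location.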
🔍 Diagnosis
It looks like the installed transformers package is trying to import GenerationMixin from:
transformers.generation
But GenerationMixin actually resides in:
transformers.generation.utils
That suggests:
- Either the __init__.py of transformers.generation is outdated/misconfigured,
- Or there's a packaging issue with the specific version being pulled into the venv on Kaggle.
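One way to check which copy of a package an environment actually resolves is importlib.util.find_spec (a diagnostic sketch; it uses a stdlib package for illustration, but on the broken venv you could pass "transformers.generation" instead):

```python
import importlib.util

def module_origin(name: str):
    """Return the file a module would be loaded from, or None if unresolvable."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec else None

# Using a stdlib package here so the snippet runs anywhere; in the affected
# environment, module_origin("transformers.generation") would show which
# __init__.py is winning the import.
print(module_origin("json"))  # path ending in json/__init__.py
```

If the printed path points somewhere other than the venv you intended, the wrong install is shadowing the right one.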
🔁 Reproducibility
This happens consistently on Kaggle in a Python 3.11 environment with a virtualenv created inside the notebook. The issue might not show up in system-wide installs, but happens when running the following code from the "Run and Save" option in Kaggle notebooks:
!virtualenv venv
!./venv/bin/pip install transformers
The error didn't happen when running the draft session; it only happened in the external session created after the "Run and Save" option on the notebook. I'd be happy to provide a minimal notebook example if needed!
Thanks in advance 🙏
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
The issue is still present as of September 2025. I am using a fresh venv on Python 3.12 with the Qwen-provided piece of code:
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-32B")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-32B")
messages = [
{"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
messages,
add_generation_prompt=True,
tokenize=True,
return_dict=True,
return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
Same error as cited previously
I can't figure this out - the import works perfectly for me from either generation or generation.utils. The only thing that could break this is a missing torch dependency, because those classes depend on Torch and won't be initialized if it isn't present.
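That failure mode can be reproduced with a stdlib-only sketch (demo_pkg and the dependency name are made up): when a package's __init__ only exports a name after an optional dependency imports cleanly, a broken dependency makes the name silently vanish from the package namespace, producing exactly this kind of ImportError:

```python
import sys
import types

# Build a fake package whose __init__ only exports GenerationMixin when an
# optional dependency is importable (here the dependency is deliberately absent).
pkg = types.ModuleType("demo_pkg")
try:
    import _nonexistent_torch_stub  # stands in for a broken torch install
    pkg.GenerationMixin = object
except ImportError:
    pass  # dependency missing: the name is simply not exported
sys.modules["demo_pkg"] = pkg

try:
    from demo_pkg import GenerationMixin
except ImportError as e:
    print(f"ImportError: {e}")
```

This is only a sketch of the mechanism, not transformers' actual init code, but it matches the observation that a missing or half-installed torch could leave GenerationMixin absent from transformers.generation.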
I faced this issue, but it got fixed after reloading the Jupyter notebook. I think it has something to do with the torch dependencies, at least in my case. Code:
import json

import torch
from datasets import Dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    Trainer,
    TrainingArguments,
)
- Initially, importing the torch module failed. I had installed torch with the command "pip install torch transformers peft ..."
- Reinstalled torch using the command from https://pytorch.org/get-started/locally/ for my CUDA Windows setup: "pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu126"
- Reran this block; torch now imported, but the current issue appeared on the second line: "ImportError: cannot import name 'GenerationMixin' from 'transformers.generation'"
- Uninstalled the transformers package and reinstalled it, to no avail.
- But after restarting the entire notebook, it worked fine. It's a pretty new machine; these packages were installed for the first time.
I solved it. My Python is 3.12 with transformers==4.51.0; you need to update setuptools, as follows: pip install --upgrade setuptools. You can try it.