transformers ImportError: cannot import name 'GenerationMixin' from 'transformers.generation'

Open qsuzer opened this issue 5 months ago • 2 comments

System Info

Package Version Editable project location

accelerate 1.7.0 aiohappyeyeballs 2.4.4 aiohttp 3.11.9 aiosignal 1.3.1 altair 5.5.0 annotated-types 0.7.0 anyio 4.6.2.post1 argon2-cffi 23.1.0 argon2-cffi-bindings 21.2.0 arrow 1.3.0 asttokens 3.0.0 async-lru 2.0.5 async-timeout 5.0.1 attrs 24.2.0 babel 2.17.0 base58 2.1.1 beautifulsoup4 4.13.3 bitsandbytes 0.45.5 bleach 6.2.0 blinker 1.9.0 blis 0.7.11 bm25s 0.2.0 cachetools 5.5.0 catalogue 2.0.10 certifi 2024.8.30 cffi 1.17.1 charset-normalizer 3.4.0 click 8.1.7 coloredlogs 15.0.1 comm 0.2.2 confection 0.1.5 contourpy 1.3.0 cycler 0.12.1 cymem 2.0.10 Cython 3.0.11 dashscope 1.22.2 datasets 3.1.0 debugpy 1.8.13 decorator 5.2.1 defusedxml 0.7.1 dill 0.3.8 distro 1.9.0 docker-pycreds 0.4.0 eval_type_backport 0.2.2 exceptiongroup 1.2.2 executing 2.2.0 faiss-gpu 1.7.2 fastapi 0.115.6 fastjsonschema 2.21.1 filelock 3.16.1 flashrag-dev 0.1.4.dev0 /home/wmz/FlashRAG flatbuffers 24.3.25 fonttools 4.56.0 fqdn 1.5.1 frozenlist 1.5.0 fschat 0.2.36 fsspec 2024.9.0 gitdb 4.0.11 GitPython 3.1.43 h11 0.14.0 hf-xet 1.1.2 httpcore 1.0.7 httpx 0.28.0 huggingface-hub 0.32.2 humanfriendly 10.0 idna 3.10 importlib_metadata 8.6.1 importlib_resources 6.5.2 ipykernel 6.29.5 ipython 8.18.1 ipywidgets 8.1.5 isoduration 20.11.0 jedi 0.19.2 Jinja2 3.1.4 jiter 0.8.0 joblib 1.4.2 json5 0.10.0 jsonlines 4.0.0 jsonpointer 3.0.0 jsonschema 4.23.0 jsonschema-specifications 2024.10.1 jupyter 1.1.1 jupyter_client 8.6.3 jupyter-console 6.6.3 jupyter_core 5.7.2 jupyter-events 0.12.0 jupyter-lsp 2.2.5 jupyter_server 2.15.0 jupyter_server_terminals 0.5.3 jupyterlab 4.3.6 jupyterlab_pygments 0.3.0 jupyterlab_server 2.27.3 jupyterlab_widgets 3.0.13 kiwisolver 1.4.7 langcodes 3.5.0 language_data 1.3.0 latex2mathml 3.77.0 lightgbm 4.5.0 llvmlite 0.43.0 marisa-trie 1.2.1 markdown-it-py 3.0.0 markdown2 2.5.1 MarkupSafe 3.0.2 matplotlib 3.9.4 matplotlib-inline 0.1.7 mdurl 0.1.2 mistune 3.1.3 modelscope 1.21.0 mpmath 1.3.0 multidict 6.1.0 multiprocess 0.70.16 murmurhash 1.0.11 narwhals 1.15.2 nbclient 0.10.2 
nbconvert 7.16.6 nbformat 5.10.4 nest-asyncio 1.6.0 networkx 3.2.1 nh3 0.2.19 nltk 3.9.1 nmslib 2.1.1 notebook 7.3.3 notebook_shim 0.2.4 numba 0.60.0 numpy 1.26.4 nvidia-cublas-cu12 12.1.3.1 nvidia-cuda-cupti-cu12 12.1.105 nvidia-cuda-nvrtc-cu12 12.1.105 nvidia-cuda-runtime-cu12 12.1.105 nvidia-cudnn-cu12 8.9.2.26 nvidia-cufft-cu12 11.0.2.54 nvidia-curand-cu12 10.3.2.106 nvidia-cusolver-cu12 11.4.5.107 nvidia-cusparse-cu12 12.1.0.106 nvidia-nccl-cu12 2.18.1 nvidia-nvjitlink-cu12 12.4.127 nvidia-nvtx-cu12 12.1.105 onnxruntime 1.19.2 openai 1.56.2 orjson 3.10.12 overrides 7.7.0 packaging 24.2 pandas 2.2.3 pandocfilters 1.5.1 parso 0.8.4 pathlib_abc 0.1.1 pathy 0.11.0 peft 0.13.2 pexpect 4.9.0 pillow 11.0.0 pip 24.3.1 platformdirs 4.3.7 preshed 3.0.9 prometheus_client 0.21.1 prompt_toolkit 3.0.48 propcache 0.2.1 protobuf 5.29.1 psutil 6.1.0 ptyprocess 0.7.0 pure_eval 0.2.3 pyarrow 18.1.0 pybind11 2.6.1 pycparser 2.22 pydantic 2.10.3 pydantic_core 2.27.1 pydeck 0.9.1 Pygments 2.18.0 pyjnius 1.6.1 pyparsing 3.2.1 pyserini 0.22.1 PyStemmer 2.2.0.3 python-dateutil 2.9.0.post0 python-json-logger 3.3.0 pytz 2024.2 PyYAML 6.0.2 pyzmq 26.3.0 qwen-agent 0.0.16 rank-bm25 0.2.2 referencing 0.35.1 regex 2024.11.6 requests 2.32.3 rfc3339-validator 0.1.4 rfc3986-validator 0.1.1 rich 13.9.4 rouge 1.0.1 rpds-py 0.22.3 safetensors 0.4.6.dev0 scikit-learn 1.6.0 scipy 1.10.1 seaborn 0.13.2 Send2Trash 1.8.3 sentence-transformers 3.3.1 sentencepiece 0.2.0 sentry-sdk 2.29.1 setproctitle 1.3.6 setuptools 75.6.0 shortuuid 1.0.13 six 1.17.0 smart-open 6.4.0 smmap 5.0.1 sniffio 1.3.1 soupsieve 2.6 spacy 3.6.1 spacy-legacy 3.0.12 spacy-loggers 1.0.5 srsly 2.4.8 stack-data 0.6.3 starlette 0.41.3 streamlit 1.40.2 svgwrite 1.4.3 sympy 1.13.1 tenacity 9.0.0 terminado 0.18.1 thinc 8.1.12 threadpoolctl 3.5.0 tiktoken 0.8.0 tinycss2 1.4.0 tokenizers 0.21.1 toml 0.10.2 tomli 2.2.1 torch 2.1.2 tornado 6.4.2 tqdm 4.67.1 traitlets 5.14.3 transformers 4.52.3 triton 2.1.0 trl 0.19.0 typer 0.9.4 
types-python-dateutil 2.9.0.20241206 typing_extensions 4.12.2 tzdata 2024.2 uri-template 1.3.0 urllib3 2.2.3 uvicorn 0.32.1 wandb 0.19.11 wasabi 1.1.3 watchdog 6.0.0 wavedrom 2.0.3.post3 wcwidth 0.2.13 webcolors 24.11.1 webencodings 0.5.1 websocket-client 1.8.0 wheel 0.45.1 widgetsnbextension 4.0.13 xxhash 3.5.0 yarl 1.18.3 zipp 3.21.0

Who can help?

No response

Information

[ ] The official example scripts
[ ] My own modified scripts

Tasks

[ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
[ ] My own task or dataset (give details below)

Reproduction

import json
import torch
import logging
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, TaskType, prepare_model_for_kbit_training
from trl import SFTTrainer, SFTConfig
from tqdm import tqdm
import os

    model, tokenizer = prepare_model_and_tokenizer(MODEL_NAME)

    # LoRA configuration
    peft_config = LoraConfig(
        r=16,
        lora_alpha=32,
        target_modules=["q_proj", "v_proj", "k_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
        lora_dropout=0.1,
        bias="none",
        task_type=TaskType.CAUSAL_LM,
    )

    # SFT configuration
    sft_config = SFTConfig(
        output_dir=OUTPUT_DIR,
        num_train_epochs=3,
        per_device_train_batch_size=4,
        per_device_eval_batch_size=4,
        gradient_accumulation_steps=4,
        optim="paged_adamw_8bit",
        save_steps=500,
        logging_steps=50,
        learning_rate=2e-4,
        weight_decay=0.001,
        fp16=True,
        bf16=False,
        max_grad_norm=0.3,
        warmup_ratio=0.03,
        lr_scheduler_type="cosine",
        eval_strategy="steps",
        eval_steps=500,
        save_total_limit=2,
        load_best_model_at_end=True,
        report_to="none",
        max_seq_length=512,
        packing=False,
        dataset_text_field="text",
    )

    # SFT trainer
    trainer = SFTTrainer(
        model=model,
        args=sft_config,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        processing_class=tokenizer,
        peft_config=peft_config,
        formatting_func=None,
    )

Expected behavior

Traceback (most recent call last):
  File "/home/wmz/FlashRAG/train_decomposer.py", line 5, in <module>
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments
  File "/data/anaconda3/envs/flashrag/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 2045, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/data/anaconda3/envs/flashrag/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 2075, in _get_module
    raise e
  File "/data/anaconda3/envs/flashrag/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 2073, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "/data/anaconda3/envs/flashrag/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/data/anaconda3/envs/flashrag/lib/python3.9/site-packages/transformers/models/auto/modeling_auto.py", line 21, in <module>
    from .auto_factory import (
  File "/data/anaconda3/envs/flashrag/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py", line 40, in <module>
    from ...generation import GenerationMixin
ImportError: cannot import name 'GenerationMixin' from 'transformers.generation' (/data/anaconda3/envs/flashrag/lib/python3.9/site-packages/transformers/generation/__init__.py)

I installed transformers with "pip install transformers", and the installed version is 4.52.3.
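It may help to check that the version pip reports matches the files Python actually loads. The sketch below is only a diagnostic aid, and it assumes the error could come from a mixed or partially upgraded install, which nothing in this report confirms. The helper name describe_install is hypothetical. It uses only the standard library and does not import the packages it inspects, so it runs even while the import itself is failing.

```python
import importlib.metadata
import importlib.util


def describe_install(package_name):
    """Return (version, location) for a package without importing it.

    Hypothetical helper for illustration only.
    version  -- version recorded in the installed distribution metadata, or None
    location -- path of the file Python would load for the package, or None
    """
    try:
        version = importlib.metadata.version(package_name)
    except importlib.metadata.PackageNotFoundError:
        version = None

    # find_spec locates a top-level package without executing its code.
    spec = importlib.util.find_spec(package_name)
    location = spec.origin if spec is not None else None
    return version, location


if __name__ == "__main__":
    # Packages that appear in the reproduction script above.
    for name in ("transformers", "trl", "peft"):
        print(name, describe_install(name))
```

If the location printed for transformers is not under the flashrag environment's site-packages directory, or the version is not 4.52.3, the import is resolving to a different copy than the one pip installed.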

May 28 '25 14:05 qsuzer
