[BUG]: From server.py: ValueError: The following `model_kwargs` are not used by the model: ['token_type_ids']
🐛 Describe the bug
I run my server with this: `python3 ./ColossalAI/applications/Chat/inference/server.py /home/ubuntu/modelpath/llama-7b/llama-7b/ --quant 8bit --http_host 0.0.0.0 --http_port 8080`
then I call the API with this:
import requests
data = {"history": [{"instruction": "where is the capital of USA", "response": ""}], "max_new_tokens": 150, "top_k": 30, "top_p": 0.5, "temperature": 0.6}
response = requests.post("http://localhost:8080/generate/stream", json=data)
print(response.text)
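(If you want to read the reply incrementally instead of printing `response.text` at the end, a variant like the sketch below may help; whether `/generate/stream` actually emits output line by line is an assumption on my part, not something confirmed in this thread.)

```python
import requests

data = {
    "history": [{"instruction": "where is the capital of USA", "response": ""}],
    "max_new_tokens": 150,
    "top_k": 30,
    "top_p": 0.5,
    "temperature": 0.6,
}

# stream=True lets requests hand back the body as it arrives; iterating line
# by line assumes the endpoint flushes output incrementally.
with requests.post("http://localhost:8080/generate/stream", json=data, stream=True) as response:
    response.raise_for_status()
    for line in response.iter_lines(decode_unicode=True):
        if line:
            print(line)
```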
As a result, I got this:
Traceback (most recent call last):
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/uvicorn/protocols/http/h11_impl.py", line 429, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in call
return await self.app(scope, receive, send)
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/fastapi/applications.py", line 276, in call
await super().call(scope, receive, send)
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/applications.py", line 122, in call
await self.middleware_stack(scope, receive, send)
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/middleware/errors.py", line 184, in call
raise exc
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/middleware/errors.py", line 162, in call
await self.app(scope, receive, _send)
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/middleware/cors.py", line 84, in call
await self.app(scope, receive, send)
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 79, in call
raise exc
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 68, in call
await self.app(scope, receive, sender)
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/fastapi/middleware/asyncexitstack.py", line 21, in call
raise e
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in call
await self.app(scope, receive, send)
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/routing.py", line 718, in call
await route.handle(scope, receive, send)
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/routing.py", line 276, in handle
await self.app(scope, receive, send)
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/routing.py", line 66, in app
response = await func(request)
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/fastapi/routing.py", line 237, in app
raw_response = await run_endpoint_function(
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/fastapi/routing.py", line 165, in run_endpoint_function
return await run_in_threadpool(dependant.call, **values)
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
return await anyio.to_thread.run_sync(func, *args)
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/slowapi/extension.py", line 762, in sync_wrapper
response = func(*args, **kwargs)
File "/home/ubuntu/./ColossalAI/applications/Chat/inference/server.py", line 118, in generate_no_stream
output = model.generate(**inputs, **data.dict(exclude={'history'}))
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/transformers/generation/utils.py", line 1231, in generate
self._validate_model_kwargs(model_kwargs.copy())
File "/opt/conda/envs/deploy/lib/python3.9/site-packages/transformers/generation/utils.py", line 1109, in _validate_model_kwargs
raise ValueError(
ValueError: The following `model_kwargs` are not used by the model: ['token_type_ids'] (note: typos in the generate arguments will also show up in this list)
Does anybody know how to resolve this?
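A common workaround for this particular ValueError, independent of this thread, is to drop `token_type_ids` from the tokenizer output before calling `generate`, since LLaMA models have no token type embeddings. The sketch below only assumes that the server builds `inputs` with a Hugging Face tokenizer; the variable names `tokenizer`, `model`, and `prompt` are placeholders, not the actual server.py code.

```python
# Hypothetical call site: tokenize the prompt, then remove the key that
# LlamaForCausalLM.generate() rejects.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
inputs.pop("token_type_ids", None)  # LLaMA does not use token type ids

output = model.generate(
    **inputs,
    max_new_tokens=150,
    top_k=30,
    top_p=0.5,
    temperature=0.6,
)
```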
Environment
My package versions are as below:

Package Version
accelerate 0.18.0
anyio 3.6.2
bitsandbytes 0.37.2
Brotli 1.0.9
certifi 2022.12.7
charset-normalizer 3.1.0
click 8.1.3
cmake 3.26.1
ConfigArgParse 1.5.3
Deprecated 1.2.13
fastapi 0.95.0
filelock 3.11.0
Flask 2.2.3
Flask-BasicAuth 0.2.0
Flask-Cors 3.0.10
gevent 22.10.2
geventhttpclient 2.0.9
greenlet 2.0.2
h11 0.14.0
huggingface-hub 0.13.4
idna 3.4
importlib-metadata 6.2.0
importlib-resources 5.12.0
itsdangerous 2.1.2
jieba 0.42.1
Jinja2 3.1.2
limits 3.3.1
lit 16.0.0
locust 2.15.1
MarkupSafe 2.1.2
mpmath 1.3.0
msgpack 1.0.5
networkx 3.1
numpy 1.24.2
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-cupti-cu11 11.7.101
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
nvidia-cufft-cu11 10.9.0.58
nvidia-curand-cu11 10.2.10.91
nvidia-cusolver-cu11 11.4.0.1
nvidia-cusparse-cu11 11.7.4.91
nvidia-nccl-cu11 2.14.3
nvidia-nvtx-cu11 11.7.91
packaging 23.0
pip 23.0.1
protobuf 4.22.1
psutil 5.9.4
pydantic 1.10.7
PyYAML 6.0
pyzmq 25.0.2
regex 2023.3.23
requests 2.28.2
roundrobin 0.0.4
safetensors 0.3.0
sentencepiece 0.1.97
setuptools 67.6.1
six 1.16.0
slowapi 0.1.8
sniffio 1.3.0
sse-starlette 1.3.3
starlette 0.26.1
sympy 1.11.1
tokenizers 0.13.3
torch 2.0.0
tqdm 4.65.0
transformers 4.28.0.dev0
triton 2.0.0
typing_extensions 4.5.0
urllib3 1.26.15
uvicorn 0.21.1
Werkzeug 2.2.3
wheel 0.40.0
wrapt 1.15.0
zipp 3.15.0
zope.event 4.6
zope.interface 6.0
Can you try downgrading `transformers` to 4.21.0?
> Can you try downgrading `transformers` to 4.21.0?
I tried, another bug happened:

Traceback (most recent call last):
File "/home/ubuntu/./ColossalAI/applications/Chat/inference/server.py", line 10, in <module>
from llama_gptq import load_quant
File "/home/ubuntu/ColossalAI/applications/Chat/inference/llama_gptq/__init__.py", line 1, in <module>
from .loader import load_quant
File "/home/ubuntu/ColossalAI/applications/Chat/inference/llama_gptq/loader.py", line 4, in <module>
from transformers import LlamaConfig, LlamaForCausalLM
ImportError: cannot import name 'LlamaConfig' from 'transformers' (/opt/conda/envs/pytorch/lib/python3.9/site-packages/transformers/__init__.py)
same problem
> Can you try downgrading `transformers` to 4.21.0?
>
> I tried, another bug happened: ImportError: cannot import name 'LlamaConfig' from 'transformers'
See https://github.com/underlines/awesome-marketing-datascience/issues/2

Try `pip uninstall transformers` and then installing the one with `pip install git+https://github.com/zphang/transformers.git@llama_push`.
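After the reinstall, a quick sanity check along these lines can confirm that the LLaMA classes which `llama_gptq/loader.py` imports are actually available; this is just a sketch, and the exact capitalization of the class names can differ between transformers builds.

```python
# Check that the installed transformers build exposes the classes imported by
# llama_gptq/loader.py; adjust the names if your build capitalizes them differently.
import transformers

print("transformers version:", transformers.__version__)
for name in ("LlamaConfig", "LlamaForCausalLM"):
    print(f"{name} available: {hasattr(transformers, name)}")
```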
Has the problem been fixed?
> Try `pip uninstall transformers` and then installing the one with `pip install git+https://github.com/zphang/transformers.git@llama_push`
This method fixed the issue.