Qwen2.5
[BUG] Running openai_api.py throws an error; cli_demo.py and web_demo.py both start normally
是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?
- [X] 我已经搜索过已有的issues和讨论 | I have searched the existing issues / discussions
该问题是否在FAQ中有解答? | Is there an existing answer for this in FAQ?
- [X] 我已经搜索过FAQ | I have searched FAQ
当前行为 | Current Behavior
(Qwen) [root@ECS-AIServer2 Qwen]# python openai_api.py
Warning: please make sure that you are using the latest codes and checkpoints, especially if you used Qwen-7B before 09.25.2023.请使用最新模型和代码,尤其如果你在9月25日前已经开始使用Qwen-7B,千万注意不要使用错误代码和模型。
The model is automatically converting to fp16 for faster inference. If you want to disable the automatic precision, please manually add bf16/fp16/fp32=True to "AutoModelForCausalLM.from_pretrained".
Try importing flash-attention for faster inference...
Warning: import flash_attn rotary fail, please install FlashAttention rotary to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/rotary
Warning: import flash_attn rms_norm fail, please install FlashAttention layer_norm to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/layer_norm
Warning: import flash_attn fail, please install FlashAttention to get higher efficiency https://github.com/Dao-AILab/flash-attention
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:10<00:00, 1.43it/s]
INFO:     Started server process [9010]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:7788 (Press CTRL+C to quit)
INFO:     10.10.10.88:5073 - "POST /v1/chat/completions HTTP/1.1" 200 OK
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/sse_starlette/sse.py", line 281, in __call__
    await wrap(partial(self.listen_for_disconnect, receive))
  File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/sse_starlette/sse.py", line 270, in wrap
    await func()
  File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/sse_starlette/sse.py", line 221, in listen_for_disconnect
    message = await receive()
  File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 542, in receive
    await self.message_event.wait()
  File "/root/anaconda3/envs/Qwen/lib/python3.10/asyncio/locks.py", line 214, in wait
    await fut
asyncio.exceptions.CancelledError: Cancelled by cancel scope 7f6c701737c0

During handling of the above exception, another exception occurred:

  + Exception Group Traceback (most recent call last):
  |   File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 412, in run_asgi
  |     result = await app(  # type: ignore[func-returns-value]
  |   File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
  |     return await self.app(scope, receive, send)
  |   File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
  |     await super().__call__(scope, receive, send)
  |   File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/starlette/applications.py", line 123, in __call__
  |     await self.middleware_stack(scope, receive, send)
  |   File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/starlette/middleware/errors.py", line 186, in __call__
  |     raise exc
  |   File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/starlette/middleware/errors.py", line 164, in __call__
  |     await self.app(scope, receive, _send)
  |   File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/starlette/middleware/cors.py", line 83, in __call__
  |     await self.app(scope, receive, send)
  |   File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
  |     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  |   File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
  |     raise exc
  |   File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
  |     await app(scope, receive, sender)
  |   File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/starlette/routing.py", line 758, in __call__
  |     await self.middleware_stack(scope, receive, send)
  |   File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/starlette/routing.py", line 778, in app
  |     await route.handle(scope, receive, send)
  |   File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/starlette/routing.py", line 299, in handle
  |     await self.app(scope, receive, send)
  |   File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/starlette/routing.py", line 79, in app
  |     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  |   File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
  |     raise exc
  |   File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
  |     await app(scope, receive, sender)
  |   File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/starlette/routing.py", line 77, in app
  |     await response(scope, receive, send)
  |   File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/sse_starlette/sse.py", line 267, in __call__
  |     async with anyio.create_task_group() as task_group:
  |   File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 678, in __aexit__
  |     raise BaseExceptionGroup(
  | exceptiongroup.ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
  +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    |   File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/sse_starlette/sse.py", line 270, in wrap
    |     await func()
    |   File "/root/anaconda3/envs/Qwen/lib/python3.10/site-packages/sse_starlette/sse.py", line 251, in stream_response
    |     async for data in self.body_iterator:
    |   File "/home/package/Qwen/openai_api.py", line 487, in predict
    |     delay_token_num = max([len(x) for x in stop_words])
    | TypeError: 'NoneType' object is not iterable
    +------------------------------------
期望行为 | Expected Behavior
No response
复现方法 | Steps To Reproduce
No response
运行环境 | Environment
- OS: CentOS 7
- Python: 3.10.13
- Transformers: 4.32.0
- PyTorch: 2.0.1
- CUDA: 11.7
- pydantic: 2.6.2
Package Version
----------------------------- ----------
accelerate 0.27.2
addict 2.4.0
aiofiles 23.2.1
aiohttp 3.9.3
aiosignal 1.3.1
aliyun-python-sdk-core 2.14.0
aliyun-python-sdk-kms 2.16.2
altair 5.2.0
annotated-types 0.6.0
anyio 4.3.0
async-timeout 4.0.3
attrs 23.2.0
bitsandbytes 0.42.0
blinker 1.7.0
Brotli 1.0.9
cachetools 5.3.2
certifi 2024.2.2
cffi 1.16.0
charset-normalizer 2.0.4
click 8.1.7
colorama 0.4.6
contourpy 1.2.0
cpm-kernels 1.0.11
crcmod 1.7
cryptography 42.0.4
cycler 0.12.1
datasets 2.17.1
dill 0.3.8
einops 0.7.0
exceptiongroup 1.2.0
fastapi 0.109.2
ffmpy 0.3.2
filelock 3.13.1
fonttools 4.49.0
frozenlist 1.4.1
fsspec 2023.10.0
gast 0.5.4
gitdb 4.0.11
GitPython 3.1.42
gmpy2 2.1.2
gradio 3.41.2
gradio_client 0.5.0
h11 0.14.0
httpcore 1.0.3
httpx 0.26.0
huggingface-hub 0.20.3
idna 3.4
importlib-metadata 7.0.1
importlib-resources 6.1.1
Jinja2 3.1.3
jmespath 0.10.0
jsonschema 4.21.1
jsonschema-specifications 2023.12.1
kiwisolver 1.4.5
latex2mathml 3.77.0
Markdown 3.5.2
markdown-it-py 3.0.0
MarkupSafe 2.1.3
matplotlib 3.8.3
mdtex2html 1.3.0
mdurl 0.1.2
mkl-fft 1.3.8
mkl-random 1.2.4
mkl-service 2.4.0
modelscope 1.12.0
mpmath 1.3.0
multidict 6.0.5
multiprocess 0.70.16
networkx 3.1
numpy 1.26.3
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.19.3
nvidia-nvjitlink-cu12 12.3.101
nvidia-nvtx-cu12 12.1.105
openai 0.28.1
orjson 3.9.14
oss2 2.18.4
packaging 23.2
pandas 2.2.0
pillow 10.2.0
pip 23.3.1
platformdirs 4.2.0
protobuf 4.25.3
psutil 5.9.8
pyarrow 15.0.0
pyarrow-hotfix 0.6
pycparser 2.21
pycryptodome 3.20.0
pydantic 2.6.2
pydantic_core 2.16.3
pydeck 0.8.1b0
pydub 0.25.1
Pygments 2.17.2
pyparsing 3.1.1
PySocks 1.7.1
python-dateutil 2.8.2
python-multipart 0.0.9
pytz 2024.1
PyYAML 6.0.1
referencing 0.33.0
regex 2023.12.25
requests 2.31.0
rich 13.7.0
rpds-py 0.18.0
safetensors 0.4.2
scipy 1.12.0
semantic-version 2.10.0
sentencepiece 0.2.0
setuptools 68.2.2
simplejson 3.19.2
six 1.16.0
smmap 5.0.1
sniffio 1.3.0
sortedcontainers 2.4.0
sse-starlette 2.0.0
starlette 0.36.3
streamlit 1.31.1
sympy 1.12
tenacity 8.2.3
tiktoken 0.6.0
tokenizers 0.13.3
toml 0.10.2
tomli 2.0.1
toolz 0.12.1
torch 2.0.1
torchaudio 2.0.2
torchvision 0.15.2
tornado 6.4
tqdm 4.66.2
transformers 4.32.0
transformers-stream-generator 0.0.4
triton 2.0.0
typing_extensions 4.9.0
tzdata 2024.1
tzlocal 5.2
urllib3 2.1.0
uvicorn 0.27.1
validators 0.22.0
watchdog 4.0.0
websockets 11.0.3
wheel 0.41.2
xformers 0.0.24
xxhash 3.4.1
yapf 0.40.2
yarl 1.9.4
zipp 3.17.0
备注 | Anything else?
I checked #330; the solution there was to upgrade pydantic. I've confirmed my pydantic version is 2.6.2, not 1.x, but the error still occurs. I also checked #437, where the error looks identical, but no solution was posted there. Since my CUDA is stuck at 11.7, PyTorch can only go up to 2.0.1; otherwise my dependencies should all be up to date (see the Environment section above). Also, cli_demo.py and web_demo.py both start normally; only openai_api.py fails. Looking forward to a solution, thanks!
I think I found the problem, though I'm not sure it's the best fix. The cause is that stop_words is None, and since None cannot be iterated, the exception is raised. So just add a guard in front:

if stop_words:
    delay_token_num = max([len(x) for x in stop_words])
else:
    delay_token_num = 0

With this, the error is gone. The change goes at line 487 of the source.
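The guard above can be reproduced in isolation; a minimal sketch (the `stop_words` and `delay_token_num` names follow the snippet above, the helper function is illustrative only):

```python
def compute_delay_token_num(stop_words):
    """Longest stop word in tokens to delay streaming by; 0 when none given."""
    # stop_words is None when the request carries no stop sequences;
    # iterating None directly raises "TypeError: 'NoneType' object is not iterable",
    # which is exactly the crash in the traceback above.
    if stop_words:
        return max(len(x) for x in stop_words)
    return 0

print(compute_delay_token_num(None))                  # 0 -- no crash
print(compute_delay_token_num(["<|im_end|>", "stop"]))  # 10
```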
Is this Qwen1.5?
> Is this Qwen1.5?

Both 7B and 14B behave this way.
After applying this fix to the problem raised in the issue, a new error appeared: RuntimeError: probability tensor contains either inf, nan or element < 0
> After applying this fix to the problem raised in the issue, a new error appeared: RuntimeError: probability tensor contains either inf, nan or element < 0
Turns out you've already seen my post, so let me add the solution.

After fixing the issue above, everything seemed fine: requests through Postman or the API endpoint directly all work.
But going through a third-party gateway such as ONE API, it fails with
RuntimeError: probability tensor contains either inf, nan or element < 0
After checking #931, the team confirmed this is expected behavior once temperature is set to 0. The OP there, @pengbj, helpfully tested it and found the error occurs whenever temperature is 0.5 or below. Then it occurred to me: that temperature is none of my business, so just keep it from going below 0.5. I edited openai_api.py again to clamp the temperature to at least 0.51:

gen_kwargs['temperature'] = request.temperature
if gen_kwargs['temperature'] < 0.51:
    gen_kwargs['temperature'] = 0.51

This goes at line 397 of the source.
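The same clamp can be written in one line with the built-in max; a minimal sketch (the 0.51 floor is the empirical workaround from above, not a documented limit; the function name is illustrative):

```python
def clamp_temperature(temperature, floor=0.51):
    # Temperatures at or below ~0.5 reportedly triggered
    # "RuntimeError: probability tensor contains either inf, nan or element < 0"
    # in this setup, so raise any lower request value up to the floor.
    return max(temperature, floor)

print(clamp_temperature(0.0))  # 0.51
print(clamp_temperature(0.9))  # 0.9
```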
Hello, a question: in the script I changed the model path to my QLoRA fine-tuned checkpoint and then ran the official API program directly, but it errors out immediately. What could be going on?
Nice, tested it myself and it works.
> Is this Qwen1.5?
>
> Both 7B and 14B behave this way.

Doesn't Qwen1.5 not support this openai_api approach? It doesn't support the chat method; it uses the Transformers-style interface instead.
> Is this Qwen1.5?
>
> Both 7B and 14B behave this way.
>
> Doesn't Qwen1.5 not support this openai_api approach? It doesn't support the chat method; it uses the Transformers-style interface instead.

I don't really get it either, or why it got moved over here. @jklj077