api-for-open-llm
Qwen1.5 inference fails with RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 0] because the unspecified dimension size -1 can be any value and is ambiguous
The following items must be checked before submission
- [X] Make sure you are using the latest code from the repository (git pull); some issues have already been addressed and fixed.
- [X] I have read the project documentation and FAQ and searched the existing issues / discussions without finding a similar problem or solution.
Type of problem
Model inference and deployment
Operating system
Windows
Detailed description of the problem
# Paste the runtime code here (delete the code block if you don't have it)
PORT=8053
# model related
MODEL_NAME=qwen2
MODEL_PATH=D:/projects/Qwen/models/Qwen1.5-14B-Chat
PROMPT_NAME=qwen2
EMBEDDING_NAME=D:/projects/Qwen/models/m3e-base
ADAPTER_MODEL_PATH=
QUANTIZE=16
CONTEXT_LEN=1200
LOAD_IN_8BIT=false
LOAD_IN_4BIT=false
USING_PTUNING_V2=false
STREAM_INTERVERL=2
# device related
DEVICE=cuda
# "auto", "cuda:0", "cuda:1", ...
DEVICE_MAP=auto
GPUS=
NUM_GPUs=1
DTYPE=half
# api related
API_PREFIX=/v1
USE_STREAMER_V2=false
ENGINE=default
Dependencies
# Please paste the dependencies here
transformers-4.38.1
Everything else matches the project's requirements.
Runtime logs or screenshots
# Please paste the run log here
INFO: 47.90.164.232:0 - "POST /v1/chat/completions HTTP/1.1" 200 OK
Traceback (most recent call last):
File "D:\projects\api-for-open-llm\api-for-open-llm\api\core\default.py", line 281, in _generate
for output in self.generate_stream_func(self.model, self.tokenizer, params):
File "D:\projects\api-for-open-llm\llm_env\Lib\site-packages\torch\utils\_contextlib.py", line 35, in generator_context
response = gen.send(None)
^^^^^^^^^^^^^^
File "D:\projects\api-for-open-llm\api-for-open-llm\api\generation\stream.py", line 85, in generate_stream
out = model(torch.as_tensor([input_ids], device=device), use_cache=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\projects\api-for-open-llm\llm_env\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\projects\api-for-open-llm\llm_env\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\projects\api-for-open-llm\llm_env\Lib\site-packages\accelerate\hooks.py", line 166, in new_forward
output = module._old_forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\projects\api-for-open-llm\llm_env\Lib\site-packages\transformers\models\qwen2\modeling_qwen2.py", line 1173, in forward
outputs = self.model(
^^^^^^^^^^^
File "D:\projects\api-for-open-llm\llm_env\Lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\projects\api-for-open-llm\llm_env\Lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\projects\api-for-open-llm\llm_env\Lib\site-packages\transformers\models\qwen2\modeling_qwen2.py", line 998, in forward
position_ids = position_ids.unsqueeze(0).view(-1, seq_length)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 0] because the unspecified dimension size -1 can be any value and is ambiguous
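The shape `[-1, 0]` in the error means `seq_length` was 0 when `position_ids.view(-1, seq_length)` ran, i.e. the model was called with an empty `input_ids`. The thread does not establish why the prompt ended up empty. One possibility, offered purely as an assumption, is a FastChat-style truncation step that keeps only the last `context_len - max_new_tokens - 1` prompt tokens: if the requested `max_tokens` exceeds `CONTEXT_LEN` (1200 in the config above), that bound goes negative and the slice can drop every token. The sketch below illustrates that failure mode and a guard against it; the function names are hypothetical and are not taken from `api/generation/stream.py`.

```python
from typing import List


def truncate_prompt(input_ids: List[int], context_len: int, max_new_tokens: int) -> List[int]:
    """Unguarded truncation (assumed pattern, not confirmed for this repo)."""
    max_src_len = context_len - max_new_tokens - 1
    # With a negative max_src_len, -max_src_len is positive, so this slice
    # skips tokens from the front and may return an empty list.
    return input_ids[-max_src_len:]


def safe_truncate_prompt(input_ids: List[int], context_len: int, max_new_tokens: int) -> List[int]:
    """Same truncation, but fail early with a readable message."""
    max_src_len = context_len - max_new_tokens - 1
    if max_src_len <= 0:
        raise ValueError(
            f"max_new_tokens={max_new_tokens} leaves no room for the prompt "
            f"within context_len={context_len}"
        )
    truncated = input_ids[-max_src_len:]
    if not truncated:
        raise ValueError("prompt is empty; refusing to call the model")
    return truncated


prompt = list(range(50))  # stand-in for 50 prompt token ids

# Typical request: the whole prompt fits.
fits = truncate_prompt(prompt, context_len=1200, max_new_tokens=256)

# Oversized request: max_new_tokens exceeds context_len, the prompt vanishes.
emptied = truncate_prompt(prompt, context_len=1200, max_new_tokens=1300)
```

If this is what is happening, lowering `max_tokens` in the request or raising `CONTEXT_LEN` would avoid the empty prompt; logging `len(input_ids)` just before the model call is a cheap way to confirm or rule this out.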
Try USE_STREAMER_V2=true
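That doesn't work; I still get the same error.
Has this issue been resolved yet?
qwen2 hits the same error too.
Do you have an example? I can't reproduce this error on my side.
After switching to USE_STREAMER_V2=true it seems to be fixed; the error hasn't appeared again so far.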