
server_config.py: startup fails when "infer_turbo": 'vllm' is enabled and chatglm3-6b is loaded

Open · Andy1018 opened this issue 11 months ago · 6 comments

2024-03-13 09:51:31,055 - startup.py[line:655] - INFO: Starting services:
2024-03-13 09:51:31,055 - startup.py[line:656] - INFO: To view the llm_api logs, go to /home/chatglm3/Langchain-Chatchat/logs

==============================Langchain-Chatchat Configuration==============================
OS: Linux-3.10.0-1160.108.1.el7.x86_64-x86_64-with-glibc2.17
Python version: 3.11.7 (main, Dec 15 2023, 18:12:31) [GCC 11.2.0]
Project version: v0.2.10
langchain version: 0.0.354
fastchat version: 0.2.35

Current text splitter: ChineseRecursiveTextSplitter
LLM model(s) being started: ['chatglm3-6b'] @ cuda
{'device': 'cuda', 'gpus': '0,1', 'host': '0.0.0.0', 'infer_turbo': 'vllm', 'limit_worker_concurrency': 20, 'max_gpu_memory': '22GiB', 'model_path': '/home/chatglm3/Langchain-Chatchat/chatglm3-6b', 'model_path_exists': True, 'num_gpus': 2, 'port': 20002}
Current Embeddings model: bge-large-zh-v1.5 @ cuda
==============================Langchain-Chatchat Configuration==============================

/home/user/anaconda3/envs/langchai-chat/lib/python3.11/site-packages/langchain_core/_api/deprecation.py:117: LangChainDeprecationWarning: Model startup will be rewritten in Langchain-Chatchat 0.3.x with more modes and faster startup; the related 0.2.x functionality will be deprecated
  warn_deprecated(
2024-03-13 09:51:36 | ERROR | stderr | INFO:     Started server process [810671]
2024-03-13 09:51:36 | ERROR | stderr | INFO:     Waiting for application startup.
2024-03-13 09:51:36 | ERROR | stderr | INFO:     Application startup complete.
2024-03-13 09:51:36 | ERROR | stderr | INFO:     Uvicorn running on http://0.0.0.0:20000 (Press CTRL+C to quit)
Process model_worker - chatglm3-6b:
Traceback (most recent call last):
  File "/home/user/anaconda3/envs/langchai-chat/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/home/user/anaconda3/envs/langchai-chat/lib/python3.11/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/chatglm3/Langchain-Chatchat/startup.py", line 389, in run_model_worker
    app = create_model_worker_app(log_level=log_level, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/chatglm3/Langchain-Chatchat/startup.py", line 109, in create_model_worker_app
    import fastchat.serve.vllm_worker
  File "/home/user/anaconda3/envs/langchai-chat/lib/python3.11/site-packages/fastchat/serve/vllm_worker.py", line 15, in <module>
    from vllm import AsyncLLMEngine
  File "/home/user/anaconda3/envs/langchai-chat/lib/python3.11/site-packages/vllm/__init__.py", line 3, in <module>
    from vllm.engine.arg_utils import AsyncEngineArgs, EngineArgs
  File "/home/user/anaconda3/envs/langchai-chat/lib/python3.11/site-packages/vllm/engine/arg_utils.py", line 6, in <module>
    from vllm.config import (CacheConfig, ModelConfig, ParallelConfig,
  File "/home/user/anaconda3/envs/langchai-chat/lib/python3.11/site-packages/vllm/config.py", line 9, in <module>
    from vllm.utils import get_cpu_memory, is_hip
  File "/home/user/anaconda3/envs/langchai-chat/lib/python3.11/site-packages/vllm/utils.py", line 11, in <module>
    from vllm._C import cuda_utils
ImportError: /home/user/anaconda3/envs/langchai-chat/lib/python3.11/site-packages/vllm/_C.cpython-311-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops15to_dtype_layout4callERKNS_6TensorEN3c108optionalINS5_10ScalarTypeEEENS6_INS5_6LayoutEEENS6_INS5_6DeviceEEENS6_IbEEbbNS6_INS5_12MemoryFormatEEE

Andy1018 · Mar 13 '24

Hey @Andy1018, we meet again! Ran into a new challenge? Hopefully we can work through this one together.

Hello Andy1018. Based on the error log you provided, the problem lies in importing the vllm module; specifically, the failure occurs when trying to import AsyncLLMEngine from vllm. This is likely because the vllm module or its dependencies are not correctly installed or configured.

I found some similar issues in the Langchain-Chatchat repository that may help:

  1. [v0.2.9] Using vllm to accelerate the original Qwen-7B-Chat and chatglm2-6b both fail; the previous version (v0.2.8) did not have this problem. The solution in that issue was to add the following four lines to the vllm configuration (see the placement sketch after this list):
"max_parallel_loading_workers":1,
"enforce_eager":False,
"max_context_len_to_capture":2048,
"max_model_len":2048
  2. Startup error: the solution in that issue was to add streamlit to the environment variables.
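
For reference, here is a minimal sketch of where those four lines would sit in server_config.py, assuming the 0.2.x FSCHAT_MODEL_WORKERS layout; the surrounding key/value pairs are copied from the startup log above, and the exact structure of your file may differ:

    # server_config.py (sketch; structure assumed from the 0.2.x layout)
    FSCHAT_MODEL_WORKERS = {
        "default": {
            "host": "0.0.0.0",
            "port": 20002,
            "device": "cuda",
            "gpus": "0,1",
            "num_gpus": 2,
            "max_gpu_memory": "22GiB",
            "infer_turbo": "vllm",
            # the four lines suggested in the linked issue:
            "max_parallel_loading_workers": 1,
            "enforce_eager": False,
            "max_context_len_to_capture": 2048,
            "max_model_len": 2048,
        },
    }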

Separately, the purpose of 'infer_turbo': 'vllm' mode is to run model inference through a specific acceleration framework, vLLM, to improve inference performance. However, using vLLM can cause communication problems with HuggingFace, and some models' vLLM support is still immature, which is why it is disabled by default ("infer_turbo": False).

The create_model_worker_app function in startup.py sets up and configures a FastAPI application that acts as a model worker. It uses keyword arguments (kwargs) to configure the worker for the different kinds of models: Langchain-supported models, online API models, and offline models.
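
As a rough illustration, the branching looks something like the sketch below. This is a paraphrase, not the verbatim 0.2.x source; the kwarg names other than infer_turbo are assumptions inferred from the traceback above, where line 109 of startup.py is the import fastchat.serve.vllm_worker statement:

    # Paraphrased sketch of create_model_worker_app (startup.py, 0.2.x).
    # Simplified for illustration; kwarg names other than infer_turbo are assumed.
    from fastapi import FastAPI

    def create_model_worker_app(log_level: str = "INFO", **kwargs) -> FastAPI:
        app = FastAPI()
        if kwargs.get("langchain_model"):
            # Langchain-supported model: no local worker process is needed.
            pass
        elif kwargs.get("worker_class"):
            # Online API model: the configured worker class wraps the remote API.
            worker = kwargs["worker_class"](model_names=kwargs.get("model_names"))
        elif kwargs.get("infer_turbo") == "vllm":
            # Offline model accelerated by vLLM -- this is the import that
            # raises the undefined-symbol ImportError in this issue.
            import fastchat.serve.vllm_worker
        else:
            # Plain offline model served by fastchat's default model worker.
            import fastchat.serve.model_worker
        return app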

I hope this information helps. If you have any other questions or need further assistance, feel free to let me know.


dosubot[bot] · Mar 13 '24

This issue is not resolved; please provide a solution.

Andy1018 · Mar 13 '24

@dosu-bot This solution is not correct.

Andy1018 · Mar 13 '24

Sorry to hear you are still having trouble. Please describe the problem in detail and I will do my best to help.


dosubot[bot] · Mar 13 '24

@dosu-bot The problem is exactly that: with the "infer_turbo": 'vllm' setting enabled in server_config.py and chatglm3-6b as the loaded model, starting with startup.py -a fails with ImportError: /home/user/anaconda3/envs/langchai-chat/lib/python3.11/site-packages/vllm/_C.cpython-311-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops15to_dtype_layout4callERKNS_6TensorEN3c108optionalINS5_10ScalarTypeEEENS6_INS5_6LayoutEEENS6_INS5_6DeviceEEENS6_IbEEbbNS6_INS5_12MemoryFormatEEE. The full printed log follows:

2024-03-13 09:51:31,055 - startup.py[line:655] - INFO: Starting services:
2024-03-13 09:51:31,055 - startup.py[line:656] - INFO: To view the llm_api logs, go to /home/chatglm3/Langchain-Chatchat/logs

==============================Langchain-Chatchat Configuration==============================
OS: Linux-3.10.0-1160.108.1.el7.x86_64-x86_64-with-glibc2.17
Python version: 3.11.7 (main, Dec 15 2023, 18:12:31) [GCC 11.2.0]
Project version: v0.2.10
langchain version: 0.0.354
fastchat version: 0.2.35

Current text splitter: ChineseRecursiveTextSplitter
LLM model(s) being started: ['chatglm3-6b'] @ cuda
{'device': 'cuda', 'gpus': '0,1', 'host': '0.0.0.0', 'infer_turbo': 'vllm', 'limit_worker_concurrency': 20, 'max_gpu_memory': '22GiB', 'model_path': '/home/chatglm3/Langchain-Chatchat/chatglm3-6b', 'model_path_exists': True, 'num_gpus': 2, 'port': 20002}
Current Embeddings model: bge-large-zh-v1.5 @ cuda
==============================Langchain-Chatchat Configuration==============================

/home/user/anaconda3/envs/langchai-chat/lib/python3.11/site-packages/langchain_core/_api/deprecation.py:117: LangChainDeprecationWarning: Model startup will be rewritten in Langchain-Chatchat 0.3.x with more modes and faster startup; the related 0.2.x functionality will be deprecated
  warn_deprecated(
2024-03-13 09:51:36 | ERROR | stderr | INFO:     Started server process [810671]
2024-03-13 09:51:36 | ERROR | stderr | INFO:     Waiting for application startup.
2024-03-13 09:51:36 | ERROR | stderr | INFO:     Application startup complete.
2024-03-13 09:51:36 | ERROR | stderr | INFO:     Uvicorn running on http://0.0.0.0:20000 (Press CTRL+C to quit)
Process model_worker - chatglm3-6b:
Traceback (most recent call last):
  File "/home/user/anaconda3/envs/langchai-chat/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/home/user/anaconda3/envs/langchai-chat/lib/python3.11/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/chatglm3/Langchain-Chatchat/startup.py", line 389, in run_model_worker
    app = create_model_worker_app(log_level=log_level, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/chatglm3/Langchain-Chatchat/startup.py", line 109, in create_model_worker_app
    import fastchat.serve.vllm_worker
  File "/home/user/anaconda3/envs/langchai-chat/lib/python3.11/site-packages/fastchat/serve/vllm_worker.py", line 15, in <module>
    from vllm import AsyncLLMEngine
  File "/home/user/anaconda3/envs/langchai-chat/lib/python3.11/site-packages/vllm/__init__.py", line 3, in <module>
    from vllm.engine.arg_utils import AsyncEngineArgs, EngineArgs
  File "/home/user/anaconda3/envs/langchai-chat/lib/python3.11/site-packages/vllm/engine/arg_utils.py", line 6, in <module>
    from vllm.config import (CacheConfig, ModelConfig, ParallelConfig,
  File "/home/user/anaconda3/envs/langchai-chat/lib/python3.11/site-packages/vllm/config.py", line 9, in <module>
    from vllm.utils import get_cpu_memory, is_hip
  File "/home/user/anaconda3/envs/langchai-chat/lib/python3.11/site-packages/vllm/utils.py", line 11, in <module>
    from vllm._C import cuda_utils
ImportError: /home/user/anaconda3/envs/langchai-chat/lib/python3.11/site-packages/vllm/_C.cpython-311-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops15to_dtype_layout4callERKNS_6TensorEN3c108optionalINS5_10ScalarTypeEEENS6_INS5_6LayoutEEENS6_INS5_6DeviceEEENS6_IbEEbbNS6_INS5_12MemoryFormatEEE

Andy1018 · Mar 13 '24

Based on the error message you provided, it seems that the issue is related to the vllm library, specifically the _C.cpython-311-x86_64-linux-gnu.so shared object file. The error message indicates that there's an undefined symbol, which usually means that the shared object file is trying to use a symbol (a variable or function) that it can't find.

This could be due to several reasons:

  1. Incompatible versions: The vllm library might be incompatible with the version of Python or PyTorch you're using. You're using Python 3.11.7, which is relatively new, and there might be compatibility issues with the library. You could try downgrading Python to an earlier version, like 3.8 or 3.9, and see if the issue persists.

  2. Missing dependencies: The vllm library might depend on other libraries that are not installed or not correctly installed in your environment. You should ensure that all required dependencies are installed. If vllm is a custom or third-party library, make sure it's correctly placed in your project's directory or installed in the Python environment.

  3. Corrupted or incomplete installation: The vllm library might not be correctly installed. This could happen if the installation process was interrupted or if there was an issue with the source code. You could try reinstalling the library and see if that resolves the issue (see the diagnostic sketch after this list).
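
To narrow down which of the three it is, comparing the torch and vllm versions is a good first step. Below is a small diagnostic sketch; the assumption is that the undefined C++ symbol (a mangled at::_ops::to_dtype_layout reference) means the vllm wheel was compiled against a different PyTorch ABI than the torch installed in this environment:

    # check_versions.py -- diagnostic sketch; assumes a torch/vllm ABI mismatch,
    # which is the usual cause of 'undefined symbol' errors in vllm/_C*.so.
    import importlib.metadata as md
    import torch

    print("torch:", torch.__version__, "CUDA:", torch.version.cuda)
    print("vllm: ", md.version("vllm"))
    # If this pair doesn't match the torch version your vllm release pins,
    # reinstall vllm so pip resolves a build matching the installed torch,
    # or reinstall torch to match vllm.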

Unfortunately, I couldn't find any specific information about the vllm mode or the chatglm3-6b model in the Langchain-Chatchat repository. If the issue persists after trying the above steps, could you provide more details about where the vllm library comes from and how it's supposed to be used? This would help in providing a more targeted solution.

I hope this helps! If you have any more questions or need further clarification, feel free to ask.


dosubot[bot] · Mar 13 '24

This issue has been marked as stale because it has had no activity for more than 30 days.

github-actions[bot] · Apr 12 '24