openai_api errors when connecting to a locally deployed ollama model
Problem Description
Locally deployed ollama qwen:32b, serving on port 11434:

```python
"openai-api": {
    "model_name": 'qwen:32b',
    "api_base_url": "http://localhost:11443/v1",
    "api_key": "sk-xxxx",
},
```
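Note that the `api_base_url` port above (11443) differs from the stated ollama deployment port (11434). A minimal sketch of the entry with the base URL matching the deployment port — the variable name is hypothetical, and the assumption that ollama ignores the API key reflects its OpenAI-compatibility layer not validating keys:

```python
# Hypothetical "openai-api" entry, assuming ollama serves its
# OpenAI-compatible API on the deployment port 11434 stated above.
openai_api_entry = {
    "model_name": "qwen:32b",
    "api_base_url": "http://localhost:11434/v1",  # report had 11443 here
    "api_key": "sk-xxxx",  # placeholder; ollama does not validate the key
}
```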
Steps to Reproduce
- Run `python startup.py -a`. Output:
```
==============================Langchain-Chatchat Configuration==============================
OS: Linux-5.15.0-105-generic-x86_64-with-glibc2.35.
Python version: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
Project version: v0.2.10
langchain version: 0.0.354. fastchat version: 0.2.35

Current text splitter: ChineseRecursiveTextSplitter
Currently started LLM models: ['chatglm3-6b', 'openai-api', 'zhipu-api'] @ cuda
{'device': 'cuda', 'host': '0.0.0.0', 'infer_turbo': False, 'model_path': 'chatglm3-6b', 'model_path_exists': True, 'port': 20002}
{'api_base_url': 'http://localhost:11443/v1', 'api_key': 'sk-', 'device': 'auto', 'host': '0.0.0.0', 'infer_turbo': False, 'model_name': 'qwen:32b', 'online_api': True, 'port': 20002}
{'api_key': '', 'device': 'auto', 'host': '0.0.0.0', 'infer_turbo': False, 'online_api': True, 'port': 21001, 'provider': 'ChatGLMWorker', 'version': 'glm-4', 'worker_class': <class 'server.model_workers.zhipu.ChatGLMWorker'>}
Current Embeddings model: bge-large-zh-v1.5 @ cuda
==============================Langchain-Chatchat Configuration==============================

2024-05-16 21:01:59,787 - startup.py[line:655] - INFO: Starting services:
2024-05-16 21:01:59,787 - startup.py[line:656] - INFO: For llm_api logs, see /home/liao/exgit/Langchain-Chatchat/logs
/home/liao/exgit/Langchain-Chatchat/lang/lib/python3.10/site-packages/langchain_core/_api/deprecation.py:119: LangChainDeprecationWarning: Model startup will be rewritten in Langchain-Chatchat 0.3.x with more modes and faster startup; the related 0.2.x functionality will be deprecated
  warn_deprecated(
2024-05-16 21:02:02 | INFO | model_worker | Register to controller
2024-05-16 21:02:02 | ERROR | stderr | INFO:     Started server process [3474493]
2024-05-16 21:02:02 | ERROR | stderr | INFO:     Waiting for application startup.
2024-05-16 21:02:02 | ERROR | stderr | INFO:     Application startup complete.
2024-05-16 21:02:02 | ERROR | stderr | INFO:     Uvicorn running on http://0.0.0.0:20000 (Press CTRL+C to quit)
2024-05-16 21:02:02 | INFO | model_worker | Loading the model ['chatglm3-6b'] on worker a1571fe4 ...
Loading checkpoint shards:   0%|          | 0/7 [00:00<?, ?it/s]
2024-05-16 21:02:02 | ERROR | stderr | /home/liao/.local/lib/python3.10/site-packages/torch/_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
2024-05-16 21:02:02 | ERROR | stderr |   return self.fget.__get__(instance, owner)()
Loading checkpoint shards: 100%|██████████| 7/7 [00:01<00:00, 6.18it/s]
2024-05-16 21:02:04 | INFO | model_worker | Register to controller
INFO:     Started server process [3474742]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:7861 (Press CTRL+C to quit)

==============================Langchain-Chatchat Configuration==============================
OS: Linux-5.15.0-105-generic-x86_64-with-glibc2.35.
Python version: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
Project version: v0.2.10
langchain version: 0.0.354. fastchat version: 0.2.35

Current text splitter: ChineseRecursiveTextSplitter
Currently started LLM models: ['chatglm3-6b', 'openai-api', 'zhipu-api'] @ cuda
{'device': 'cuda', 'host': '0.0.0.0', 'infer_turbo': False, 'model_path': 'chatglm3-6b', 'model_path_exists': True, 'port': 20002}
{'api_base_url': 'http://localhost:11443/v1', 'api_key': 'sk-xxxx', 'device': 'auto', 'host': '0.0.0.0', 'infer_turbo': False, 'model_name': 'qwen:32b', 'online_api': True, 'port': 20002}
{'api_key': '', 'device': 'auto', 'host': '0.0.0.0', 'infer_turbo': False, 'online_api': True, 'port': 21001, 'provider': 'ChatGLMWorker', 'version': 'glm-4', 'worker_class': <class 'server.model_workers.zhipu.ChatGLMWorker'>}
Current Embeddings model: bge-large-zh-v1.5 @ cuda

Server info:
    OpenAI API Server: http://127.0.0.1:20000/v1
    Chatchat  API  Server: http://127.0.0.1:7861
    Chatchat WEBUI Server: http://0.0.0.0:7860
==============================Langchain-Chatchat Configuration==============================

Collecting usage statistics. To deactivate, set browser.gatherUsageStats to False.

You can now view your Streamlit app in your browser.

URL: http://0.0.0.0:7860

2024-05-16 21:14:58,849 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:20001/list_models "HTTP/1.1 200 OK"
INFO:     127.0.0.1:43496 - "POST /llm_model/list_running_models HTTP/1.1" 200 OK
2024-05-16 21:14:58,850 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:7861/llm_model/list_running_models "HTTP/1.1 200 OK"
2024-05-16 21:14:58,990 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:20001/list_models "HTTP/1.1 200 OK"
INFO:     127.0.0.1:43496 - "POST /llm_model/list_running_models HTTP/1.1" 200 OK
2024-05-16 21:14:58,991 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:7861/llm_model/list_running_models "HTTP/1.1 200 OK"
INFO:     127.0.0.1:43496 - "POST /llm_model/list_config_models HTTP/1.1" 200 OK
2024-05-16 21:14:58,994 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:7861/llm_model/list_config_models "HTTP/1.1 200 OK"
```
- Open http://0.0.0.0:7860 in Safari, select the LLM model openai-api, type "hello", and press Enter; an error occurs:
```
INFO:     127.0.0.1:35520 - "POST /llm_model/get_model_config HTTP/1.1" 200 OK
2024-05-16 21:19:02,110 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:7861/llm_model/get_model_config "HTTP/1.1 200 OK"
2024-05-16 21:19:02,197 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:20001/list_models "HTTP/1.1 200 OK"
INFO:     127.0.0.1:35526 - "POST /llm_model/list_running_models HTTP/1.1" 200 OK
2024-05-16 21:19:02,198 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:7861/llm_model/list_running_models "HTTP/1.1 200 OK"
2024-05-16 21:19:02,224 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:20001/list_models "HTTP/1.1 200 OK"
INFO:     127.0.0.1:35526 - "POST /llm_model/list_running_models HTTP/1.1" 200 OK
2024-05-16 21:19:02,225 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:7861/llm_model/list_running_models "HTTP/1.1 200 OK"
INFO:     127.0.0.1:35526 - "POST /llm_model/list_config_models HTTP/1.1" 200 OK
2024-05-16 21:19:02,230 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:7861/llm_model/list_config_models "HTTP/1.1 200 OK"
2024-05-16 21:19:08,080 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:20001/list_models "HTTP/1.1 200 OK"
INFO:     127.0.0.1:35528 - "POST /llm_model/list_running_models HTTP/1.1" 200 OK
2024-05-16 21:19:08,081 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:7861/llm_model/list_running_models "HTTP/1.1 200 OK"
2024-05-16 21:19:08,107 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:20001/list_models "HTTP/1.1 200 OK"
INFO:     127.0.0.1:35528 - "POST /llm_model/list_running_models HTTP/1.1" 200 OK
2024-05-16 21:19:08,108 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:7861/llm_model/list_running_models "HTTP/1.1 200 OK"
INFO:     127.0.0.1:35528 - "POST /llm_model/list_config_models HTTP/1.1" 200 OK
2024-05-16 21:19:08,111 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:7861/llm_model/list_config_models "HTTP/1.1 200 OK"
INFO:     127.0.0.1:35528 - "POST /chat/chat HTTP/1.1" 200 OK
/home/liao/exgit/Langchain-Chatchat/lang/lib/python3.10/site-packages/langchain_core/_api/deprecation.py:119: LangChainDeprecationWarning: The class `ChatOpenAI` was deprecated in LangChain 0.0.10 and will be removed in 0.2.0. An updated version of the class exists in the langchain-openai package and should be used instead. To use it run `pip install -U langchain-openai` and import as `from langchain_openai import ChatOpenAI`.
  warn_deprecated(
2024-05-16 21:19:08,304 - _client.py[line:1027] - INFO: HTTP Request: POST http://127.0.0.1:7861/chat/chat "HTTP/1.1 200 OK"
2024-05-16 21:19:08,314 - _base_client.py[line:1611] - INFO: Retrying request to /chat/completions in 0.779848 seconds
2024-05-16 21:19:09,097 - _base_client.py[line:1611] - INFO: Retrying request to /chat/completions in 1.693247 seconds
2024-05-16 21:19:10,794 - utils.py[line:38] - ERROR: Connection error.
```
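Since the retries end in a plain connection error, the ollama endpoint can be checked independently of Langchain-Chatchat. A minimal stdlib-only sketch; the base URL assumes the deployment port 11434 stated above, the model name is taken from the report, and the helper names are hypothetical:

```python
import json
import urllib.request

# Assumed endpoint: ollama's OpenAI-compatible API on the stated deployment port.
BASE_URL = "http://localhost:11434/v1"
MODEL = "qwen:32b"

def build_request(base_url: str = BASE_URL, model: str = MODEL) -> urllib.request.Request:
    """Build a POST to /chat/completions with a one-message conversation."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": "hello"}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer sk-xxxx",  # placeholder; ollama does not validate it
        },
        method="POST",
    )

def send(req: urllib.request.Request) -> str:
    """POST the request and return the assistant's reply text."""
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

Calling `send(build_request())` from the same machine should return a reply if the endpoint is reachable; a connection error here would point at the base URL or port rather than at Langchain-Chatchat.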
Environment Information
Latest version, not Docker; ollama and Langchain-Chatchat are deployed on Ubuntu 22.04 with 2× RTX 4090.
The OpenAI-compatible API on ollama itself works fine.
Browser: Safari on macOS.
One more question: why do the Zhipu model and API show as running while the others do not? Where is that configured? I could not find it.
Hi @liaoweiguo! 👋 I'm Dosu, a bot that can help you solve bugs, answer questions, and become a contributor. While you wait for a human maintainer, I'm here to help. I'll take a look at your issue now and provide a full answer shortly. 🚀
```python
ONLINE_LLM_MODEL = {
    "dolphin-llama3": {  # ollama
        "api_base_url": "http://localhost:11434/v1",
        "options": 'num_predict 4098'  # passed to the ollama model as a PARAMETER
    },
    "qwen:7b-chat-v1.5-q6_K": {  # ollama, 85% w noise
        "api_base_url": "http://localhost:11434/v1",
        "options": 'num_predict 4098'  # passed to the ollama model as a PARAMETER
```
This issue has been marked as stale because it has had no activity for more than 30 days.