Langchain-Chatchat: custom models registered in xinference cannot be found, and asking a question fails


Open kydbj opened this issue 7 months ago • 3 comments

Problem Description: Langchain-Chatchat cannot find the custom models registered in xinference, and asking a question fails with an error.

Steps to Reproduce

1. Ran 'xinference-local --host 0.0.0.0 --port 9997' to start xinference, registered two custom models, bge-large-zh-lacal and glm4-local, and launched both.
2. Ran 'chatchat init', then edited two configuration files:

basic_settings.yaml

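# Basic server settings
# Changes to log_verbose/HTTPX_DEFAULT_TIMEOUT take effect immediately
# All other settings require a server restart to take effect; do not modify them while the service is running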


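# Project code version that generated this config template. If this value differs from the actual program version, regenerating the config template is recommended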
version: 0.3.1.2

# Whether to enable verbose logging
log_verbose: false

# Default httpx request timeout (seconds). If model loading or chat is slow and timeout errors occur, increase this value.
HTTPX_DEFAULT_TIMEOUT: 300.0

# Default knowledge base storage path
KB_ROOT_PATH: /mnt/add_disk/Langchain-Chatchat/data/knowledge_base

# Default database storage path. With sqlite, modify DB_ROOT_PATH directly; with any other database, modify SQLALCHEMY_DATABASE_URI directly.
DB_ROOT_PATH: /mnt/add_disk/Langchain-Chatchat/data/knowledge_base/info.db

# Connection URI of the knowledge base metadata database
SQLALCHEMY_DATABASE_URI: sqlite:////mnt/add_disk/Langchain-Chatchat/data/knowledge_base/info.db

# Whether the API allows cross-origin requests
OPEN_CROSS_DOMAIN: false

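# Default bind host for all servers. If changed to "0.0.0.0", the host of every XX_SERVER below must be changed as well
# On Windows, the browser that the WEBUI opens automatically cannot reach the address "0.0.0.0"; edit the address bar manually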
DEFAULT_BIND_HOST: 0.0.0.0

# API server address. public_host is used to generate public access links for cloud deployments (e.g. knowledge base document links)
API_SERVER:
  host: 0.0.0.0
  port: 7861
  public_host: 127.0.0.1
  public_port: 7861

# WEBUI server address
WEBUI_SERVER:
  host: 0.0.0.0
  port: 8501

model_settings.yaml

# Model settings


# Name of the default LLM
DEFAULT_LLM_MODEL: glm4-local

# Name of the default Embedding model
DEFAULT_EMBEDDING_MODEL: bge-large-zh-lacal

# Name of the Agent LLM (optional; if set, the model used by the Chain after entering Agent mode is locked to it; if unset, DEFAULT_LLM_MODEL is used)
Agent_MODEL: ''

# Default number of history turns
HISTORY_LEN: 3

# Maximum length supported by the LLM. If left empty, the model's default maximum length is used; if set, the user-specified maximum applies
MAX_TOKENS:

# General LLM chat parameters
TEMPERATURE: 0.7

# Supported Agent models
SUPPORT_AGENT_MODELS:
  - chatglm3-6b
  - glm-4
  - openai-api
  - Qwen-2
  - qwen2-instruct
  - gpt-3.5-turbo
  - gpt-4o

# LLM configuration, including initialization parameters for the different modalities.
# If `model` is left empty, DEFAULT_LLM_MODEL is used automatically
LLM_MODEL_CONFIG:
  preprocess_model:
    model: ''
    temperature: 0.05
    max_tokens: 4096
    history_len: 10
    prompt_name: default
    callbacks: false
  llm_model:
    model: ''
    temperature: 0.9
    max_tokens: 4096
    history_len: 10
    prompt_name: default
    callbacks: true
  action_model:
    model: ''
    temperature: 0.01
    max_tokens: 4096
    history_len: 10
    prompt_name: ChatGLM3
    callbacks: true
  postprocess_model:
    model: ''
    temperature: 0.01
    max_tokens: 4096
    history_len: 10
    prompt_name: default
    callbacks: true
  image_model:
    model: sd-turbo
    size: 256*256

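# # Model platform configuration


# # Platform name
# platform_name: xinference

# # Platform type
# # Allowed values: ['xinference', 'ollama', 'oneapi', 'fastchat', 'openai', 'custom openai']
# platform_type: xinference

# # openai api url
# api_base_url: http://127.0.0.1:9997/v1

# # api key if available
# api_key: EMPTY

# # API proxy
# api_proxy: ''

# # Maximum concurrency per model on this platform
# api_concurrencies: 5

# # Whether to fetch the platform's list of available models automatically. When True, the model types below are detected automatically
# auto_detect_model: false

# # LLMs supported by this platform; detected automatically when auto_detect_model is True
# llm_models: []

# # Embedding models supported by this platform; detected automatically when auto_detect_model is True
# embed_models: []

# # Image generation models supported by this platform; detected automatically when auto_detect_model is True
# text2image_models: []

# # Multimodal models supported by this platform; detected automatically when auto_detect_model is True
# image2text_models: []

# # Rerank models supported by this platform; detected automatically when auto_detect_model is True
# rerank_models: []

# # STT models supported by this platform; detected automatically when auto_detect_model is True
# speech2text_models: []

# # TTS models supported by this platform; detected automatically when auto_detect_model is True
# text2speech_models: []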
MODEL_PLATFORMS:
  - platform_name: xinference
    platform_type: xinference
    api_base_url: http://127.0.0.1:9997/v1
    api_key: EMPTY
    api_proxy: ''
    api_concurrencies: 5
    auto_detect_model: true
    llm_models: [glm4-local]
    embed_models: [bge-large-zh-lacal]
    text2image_models: []
    image2text_models: []
    rerank_models: []
    speech2text_models: []
    text2speech_models: []
  - platform_name: ollama
    platform_type: ollama
    api_base_url: http://127.0.0.1:11434/v1
    api_key: EMPTY
    api_proxy: ''
    api_concurrencies: 5
    auto_detect_model: false
    llm_models:
      - qwen:7b
      - qwen2:7b
    embed_models:
      - quentinz/bge-large-zh-v1.5
    text2image_models: []
    image2text_models: []
    rerank_models: []
    speech2text_models: []
    text2speech_models: []
  - platform_name: oneapi
    platform_type: oneapi
    api_base_url: http://127.0.0.1:3000/v1
    api_key: sk-
    api_proxy: ''
    api_concurrencies: 5
    auto_detect_model: false
    llm_models:
      - chatglm_pro
      - chatglm_turbo
      - chatglm_std
      - chatglm_lite
      - qwen-turbo
      - qwen-plus
      - qwen-max
      - qwen-max-longcontext
      - ERNIE-Bot
      - ERNIE-Bot-turbo
      - ERNIE-Bot-4
      - SparkDesk
    embed_models:
      - text-embedding-v1
      - Embedding-V1
    text2image_models: []
    image2text_models: []
    rerank_models: []
    speech2text_models: []
    text2speech_models: []
  - platform_name: openai
    platform_type: openai
    api_base_url: https://api.openai.com/v1
    api_key: sk-proj-
    api_proxy: ''
    api_concurrencies: 5
    auto_detect_model: false
    llm_models:
      - gpt-4o
      - gpt-3.5-turbo
    embed_models:
      - text-embedding-3-small
      - text-embedding-3-large
    text2image_models: []
    image2text_models: []
    rerank_models: []
    speech2text_models: []
    text2speech_models: []

3. Ran chatchat start -r and asked a question

Expected Result: a normal answer is returned

Actual Result: after running chatchat start -r:

2024-07-22 01:24:56.469 | WARNING  | chatchat.server.utils:get_default_embedding:214 - default embedding model bge-large-zh-lacal is not found in available embeddings, using quentinz/bge-large-zh-v1.5 instead
2024-07-22 01:24:56.506 | WARNING  | chatchat.server.utils:get_default_embedding:214 - default embedding model bge-large-zh-lacal is not found in available embeddings, using quentinz/bge-large-zh-v1.5 instead
2024-07-22 01:24:56.540 | WARNING  | chatchat.server.utils:get_default_embedding:214 - default embedding model bge-large-zh-lacal is not found in available embeddings, using quentinz/bge-large-zh-v1.5 instead
2024-07-22 01:24:57.539 | WARNING  | chatchat.server.utils:get_default_llm:205 - default llm model glm4-local is not found in available llms, using qwen:7b instead
2024-07-22 01:24:57.776 | WARNING  | chatchat.server.utils:get_default_llm:205 - default llm model glm4-local is not found in available llms, using qwen:7b instead
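
The warnings above say the configured names are not in the list of available models. With auto_detect_model: true, the config comments indicate that the model list is fetched from the platform, so one way to narrow this down is to ask xinference directly which model ids it is serving and compare them with the configured names. The following is a minimal sketch, not chatchat's own code. It assumes xinference answers GET /v1/models in the OpenAI list format (a "data" array of objects with an "id" field, where the id is the model UID chosen at launch); fetch_model_ids and missing_models are hypothetical helper names.

```python
import json
import urllib.request


def fetch_model_ids(api_base_url, timeout=5.0):
    """Return the model ids listed by an OpenAI-compatible server.

    Assumption: the server answers GET <api_base_url>/models with
    {"data": [{"id": ...}, ...]}.
    """
    url = api_base_url.rstrip("/") + "/models"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    return [item["id"] for item in payload.get("data", [])]


def missing_models(configured, available):
    """Return the configured names that the server does not list, in order."""
    known = set(available)
    return [name for name in configured if name not in known]


def report(api_base_url, configured):
    """Print which configured names are served and which are missing."""
    try:
        available = fetch_model_ids(api_base_url)
    except OSError as exc:  # URLError and HTTPError are OSError subclasses
        print(f"could not reach {api_base_url}: {exc}")
        return
    print("served: ", available)
    print("missing:", missing_models(configured, available))
```

Calling report("http://127.0.0.1:9997/v1", ["glm4-local", "bge-large-zh-lacal"]) on the machine running chatchat would show whether the two names from model_settings.yaml match the ids xinference reports. If they appear under "missing", the launched model UIDs probably differ from the configured names.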

Asking a question in the web UI produces this error:

2024-07-22 01:42:46.424 | ERROR    | chatchat.server.api_server.openai_routes:generator:105 - openai request error: Connection error.
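
One possible reading of this error, given the earlier warnings: chatchat fell back to qwen:7b, which model_settings.yaml lists under the ollama platform at http://127.0.0.1:11434/v1. If no ollama server is listening there, the request would fail with a connection error. This is an inference from the logs, not something the logs state. The sketch below checks whether anything accepts TCP connections at each configured api_base_url; host_port and is_listening are hypothetical helper names.

```python
import socket
from urllib.parse import urlsplit


def host_port(api_base_url):
    """Split a base URL into (host, port), defaulting the port by scheme."""
    parts = urlsplit(api_base_url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


def is_listening(api_base_url, timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = host_port(api_base_url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running is_listening on http://127.0.0.1:9997/v1 and http://127.0.0.1:11434/v1 would show which of the two local platforms is actually up. A successful TCP connection only proves that a process is listening, not that the expected models are loaded.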

Environment Information

langchain-ChatGLM version/commit: 0.3.1.2
Deployed with Docker (yes/no): no
Model used (ChatGLM2-6B / Qwen-7B, etc.): glm-4-chat-local
Embedding model used (moka-ai/m3e-base, etc.): bge-large-zh-local
Vector store type (faiss / milvus / pg_vector, etc.): faiss
Operating system and version: linux
Python version: python 3.11.9
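
Note that the model names in this list (glm-4-chat-local, bge-large-zh-local) differ from the names used in model_settings.yaml and in the warnings (glm4-local, bge-large-zh-lacal). Which set matches the names actually registered in xinference is not stated here. If the names in this list are the registered ones, the configured names would never match. A small sketch for spotting such near-misses with the standard library's difflib follows; closest_name is a hypothetical helper, and the candidate list is taken from the environment information above.

```python
import difflib


def closest_name(name, candidates, cutoff=0.6):
    """Return the candidate most similar to name, or None if none is close."""
    matches = difflib.get_close_matches(name, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


registered = ["glm-4-chat-local", "bge-large-zh-local"]  # from the list above
for configured in ["glm4-local", "bge-large-zh-lacal"]:
    print(configured, "->", closest_name(configured, registered))
```

In practice the candidate list would come from the ids that xinference reports rather than from a hand-written list.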

Jul 22 '24 02:07 kydbj
