
Growing GPU memory usage in embedding causes the model to go offline

Open linqingxu opened this issue 1 year ago • 3 comments

System Info

xinference v0.15.1 (the problem has actually been present since 0.14.0). The GPU is an A40.

Running Xinference with Docker?

  • [X] docker
  • [ ] pip install
  • [ ] installation from source

Version info

0.15.1

The command used to start Xinference

docker run

Reproduction

  1. Launch an embedding model (I used bge-m3 with a chunk_size of 1k)
  2. Add documents to the vector database (through API calls)
  3. CUDA out of memory
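
The steps above can be sketched as a small client script. This is a minimal sketch under stated assumptions, not the reporter's actual code: it assumes the server is reachable at `http://localhost:9997`, that the model was launched with the UID `bge-m3`, that embeddings are requested through the OpenAI-compatible `/v1/embeddings` route, and that the 1k chunk size is measured in characters. `make_chunks`, `embed_batch`, and `embed_document` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9997"  # assumption: local Xinference endpoint
MODEL_UID = "bge-m3"                # assumption: UID used when launching the model


def make_chunks(text, chunk_size=1000):
    """Split text into consecutive chunks of at most chunk_size characters."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def embed_batch(chunks, base_url=BASE_URL, model=MODEL_UID):
    """POST one batch of chunks to the embeddings route and return the vectors."""
    payload = json.dumps({"model": model, "input": chunks}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return [item["embedding"] for item in body["data"]]


def embed_document(text, batch_size=16):
    """Embed a document batch by batch, as a vector-database ingest would."""
    chunks = make_chunks(text)
    vectors = []
    for start in range(0, len(chunks), batch_size):
        vectors.extend(embed_batch(chunks[start:start + batch_size]))
    return vectors
```

Calling `embed_document` repeatedly over a large corpus while watching GPU memory is one way to check whether usage keeps growing between requests.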

Expected behavior

The embedding model releases GPU memory after use, so its memory footprint stays stable.

linqingxu avatar Sep 19 '24 09:09 linqingxu

bge-m3 and bge-reranker-v2-m3 share one GPU. At peak, GPU memory usage spikes to 55G (the card itself only has 46G of memory), whereas bge-m3 normally uses less than 3G.
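
To put figures like these on a timeline, GPU memory can be sampled while the embedding requests run. The following is a minimal sketch, assuming `nvidia-smi` is available on the PATH inside the container; `parse_memory` and `sample_memory` are hypothetical helper names.

```python
import subprocess


def parse_memory(csv_text):
    """Parse 'used, total' lines (MiB, one per GPU) into (used, total) int pairs."""
    readings = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        readings.append((used, total))
    return readings


def sample_memory():
    """Query nvidia-smi once and return per-GPU (used_mib, total_mib) pairs."""
    output = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_memory(output)
```

Logging `sample_memory()` before and after each batch of requests would show whether memory is returned once a request finishes or accumulates over time.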

linqingxu avatar Sep 19 '24 09:09 linqingxu

I ran into this problem too. Is there a fix?

deific avatar Sep 26 '24 07:09 deific

This issue is stale because it has been open for 7 days with no activity.

github-actions[bot] avatar Oct 03 '24 19:10 github-actions[bot]

This issue was closed because it has been inactive for 5 days since being marked as stale.

github-actions[bot] avatar Oct 09 '24 19:10 github-actions[bot]

Is it that the embedding model goes offline for no apparent reason while the LLM stays up?

zifeng-radxa avatar Oct 23 '24 03:10 zifeng-radxa