Langchain-Chatchat [BUG] Questions about formulas frequently run out of GPU memory

Problem Description

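For example, asking "Please give me a formula from advanced mathematics" runs out of GPU memory about five or six times out of ten.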

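The Docker version is running with five 3090 GPUs (24 GB each). The chatglm-6b model and the text2vec embedding model were downloaded manually to local storage. All other settings are left at their defaults.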

Expected Result: Describe the expected result.

Actual Result: Describe the actual result.

Environment Information

langchain-ChatGLM version/commit: e8b2ddea51e7813682b422e3bbb6c54e200646d9
Docker deployment used (yes/no): yes
Model used (ChatGLM-6B / ClueAI/ChatYuan-large-v2, etc.): local ChatGLM-6B
Embedding model used (GanymedeNil/text2vec-large-chinese, etc.): local GanymedeNil/text2vec-large-chinese
Operating system and version: Ubuntu 18.04
Python version:
Other relevant environment information:

Additional Information

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 54.96 GiB (GPU 0; 23.69 GiB total capacity; 5.70 GiB already allocated; 14.90 GiB free; 7.79 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
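
The numbers in this message are worth comparing against each other. The sketch below is only an illustration (the helper name `parse_oom` is hypothetical and not part of PyTorch or this project): it pulls the GiB figures out of the message text and compares them. Under that reading, the single requested allocation (54.96 GiB) is larger than the card's total capacity (23.69 GiB), which suggests one oversized request rather than fragmentation, so the `max_split_size_mb` hint on its own would probably not help here.

```python
import re

# The error text reported above, shortened to the part that carries the numbers.
ERROR = (
    "CUDA out of memory. Tried to allocate 54.96 GiB (GPU 0; 23.69 GiB total capacity; "
    "5.70 GiB already allocated; 14.90 GiB free; 7.79 GiB reserved in total by PyTorch)"
)


def parse_oom(message: str) -> dict:
    """Extract the GiB figures from an OOM message shaped like the one above.

    Hypothetical helper for illustration; it only understands this message format.
    """
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) GiB",
        "total": r"([\d.]+) GiB total capacity",
        "allocated": r"([\d.]+) GiB already allocated",
        "free": r"([\d.]+) GiB free",
        "reserved": r"([\d.]+) GiB reserved",
    }
    stats = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:
            stats[name] = float(match.group(1))
    return stats


stats = parse_oom(ERROR)

# A request larger than the whole card cannot succeed, however memory is split.
exceeds_card = stats["requested"] > stats["total"]

# Memory reserved by the allocator but not in use; a large gap would hint at fragmentation.
fragmentation_gap = stats["reserved"] - stats["allocated"]

print(f"requested {stats['requested']} GiB, card total {stats['total']} GiB")
print(f"single request exceeds card capacity: {exceeds_card}")
print(f"reserved but unallocated: {fragmentation_gap:.2f} GiB")
```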

May 09 '23 17:05 KyonCN

Does this happen in LLM chat mode or in knowledge base Q&A mode?

May 09 '23 23:05 imClumsyPanda

> Does this happen in LLM chat mode or in knowledge base Q&A mode?

It can happen in both LLM chat mode and knowledge base Q&A mode.

May 10 '23 05:05 KyonCN

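In general, this kind of problem occurs because too much content is being passed to the LLM. I suggest checking the maximum length of each segment after your documents have been loaded and split into sentences. If a segment exceeds sentence_chunk_size, or even chunk_size, it can cause this problem. The recommended fix is to create a textsplitter class tailored to the characteristics of your documents so that sentence splitting works better.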
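
The length check and the custom splitting described above can be sketched in plain Python. This is a minimal illustration, not the project's own splitter: the function names and the limit of 100 are assumptions, so substitute the sentence_chunk_size value from your own configuration. It splits on common Chinese and English sentence terminators, then hard-wraps any segment that is still longer than the limit.

```python
import re

# Hypothetical limit for illustration; use the sentence_chunk_size from your own configuration.
SENTENCE_CHUNK_SIZE = 100

# A sentence is a run of text ending in one or more terminators, or a trailing run without one.
_SENTENCE = re.compile(r"[^。！？；!?;\n]*[。！？；!?;\n]+|[^。！？；!?;\n]+")


def split_sentences(text: str, max_len: int = SENTENCE_CHUNK_SIZE) -> list:
    """Split text into sentences and hard-wrap any sentence longer than max_len."""
    segments = []
    for sentence in _SENTENCE.findall(text):
        sentence = sentence.strip()
        if not sentence:
            continue
        # Sentences within the limit pass through as one slice; longer ones are cut up.
        for start in range(0, len(sentence), max_len):
            segments.append(sentence[start:start + max_len])
    return segments


def longest_segment(segments: list) -> int:
    """Return the length of the longest segment, or 0 for an empty list."""
    return max((len(s) for s in segments), default=0)


# A document with one pathological "sentence" of 251 characters.
text = "短句。" + "长" * 250 + "。结束！"
segments = split_sentences(text, max_len=100)

print(f"{len(segments)} segments, longest is {longest_segment(segments)} characters")
```

Running `longest_segment` over the output of whatever splitter is actually in use is a quick way to see whether any segment exceeds the configured limit before it reaches the model.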

May 10 '23 05:05 imClumsyPanda

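> In general, this kind of problem occurs because too much content is being passed to the LLM. I suggest checking the maximum length of each segment after your documents have been loaded and split into sentences. If a segment exceeds sentence_chunk_size, or even chunk_size, it can cause this problem. The recommended fix is to create a textsplitter class tailored to the characteristics of your documents so that sentence splitting works better.

I forgot to mention: when I ran these tests, no documents had been added to the knowledge base yet, which is why this seems so strange.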

May 10 '23 05:05 KyonCN
