
PDF parsing with RapidOCRPDFLoader before embedding into the vector store is very slow / fallback configuration for the knowledge-base QA model

Open guodastanson opened this issue 1 year ago • 5 comments

Question 1: PDF parsing uses the Python tool RapidOCRPDFLoader. On my 48-core server all cores are saturated during parsing, yet throughput is only around 20-40 s/it. Can this step be offloaded to the GPU for faster parsing?

Question 2: In knowledge-base QA, when no relevant content is retrieved, how can I make the model reply with a preset template instead of improvising from its own knowledge? The iFLYTEK Spark (星火) API offered a fallback option on the call itself. Can this be configured in Langchain-Chatchat, or do I need to wrap or modify the code myself?

guodastanson avatar Mar 12 '24 09:03 guodastanson

Hey @guodastanson, good to see you again! Hope all is well.

Regarding your first question: Langchain-Chatchat's RapidOCRPDFLoader does support GPU-accelerated parsing. When get_ocr is called, make sure the use_cuda parameter is set to True. It is forwarded to the RapidOCR constructor as det_use_cuda=use_cuda, cls_use_cuda=use_cuda, rec_use_cuda=use_cuda; when use_cuda is True, the tool attempts CUDA-based GPU acceleration, which should speed up parsing. See the code in document_loaders/ocr.py.
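As a sketch of the mechanism described above (the helper name `rapidocr_cuda_kwargs` is hypothetical, but the three constructor arguments are the ones named in this thread), the single `use_cuda` flag fans out into three per-model switches:

```python
def rapidocr_cuda_kwargs(use_cuda: bool) -> dict:
    # Hypothetical helper: fan the single use_cuda flag out into the
    # three RapidOCR constructor arguments mentioned above.
    return {
        "det_use_cuda": use_cuda,  # text-detection model
        "cls_use_cuda": use_cuda,  # orientation-classification model
        "rec_use_cuda": use_cuda,  # text-recognition model
    }

# Usage sketch: ocr = RapidOCR(**rapidocr_cuda_kwargs(use_cuda=True))
```

Note that the flags only take effect if a CUDA-capable build of the OCR backend is installed; otherwise they are silently ignored or raise at construction time, depending on the backend.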

Regarding your second question: the provided context contains no information about how knowledge-base QA responses are handled or configured, so based on the given context I cannot offer a way to make the system reply with a preset template in those cases. More context might help locate a solution.

I also noticed a similar closed issue in the Langchain-Chatchat tracker, [FEATURE] RapidOCRPDFLoader GPU加速 / Accelerate RapidOCRPDFLoader processing with GPU. The developers have acknowledged the need for GPU acceleration and are considering implementing it, so this may be addressed in a future release.

Sources

About Dosu

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

dosubot[bot] avatar Mar 12 '24 09:03 dosubot[bot]

More detail on the second question: using Langchain-Chatchat's knowledge-base QA with chatGLM3-6b (other models behave the same), the prompt instructs the model to answer exactly "暂未找到相关内容" ("No relevant content found") when nothing relevant is retrieved, rather than generating its own answer. In practice, when no match is found in the Milvus vector store, the model still produces an answer of its own. How can I stop the model from answering on its own in this situation?

guodastanson avatar Mar 13 '24 01:03 guodastanson

[screenshot: 微信图片_20240313100011]

Question 3: the model sometimes wraps its answers in <回答></回答> tags. How can these be removed? Question 4: when I ask a single question, the model sometimes repeats question-and-answer rounds on its own, generating , and sometimes the Rounds contain exactly the same content. What causes this?

guodastanson avatar Mar 13 '24 02:03 guodastanson

@dosubot

guodastanson avatar Mar 14 '24 02:03 guodastanson

Based on the issues and code snippets you've provided, here are some possible solutions to your problems:

  1. RapidOCRPDFLoader's slow parsing speed on a 48-core server:

    • There is a closed issue [FEATURE] RapidOCRPDFLoader GPU加速 / Accelerate RapidOCRPDFLoader processing with GPU that discusses the need for GPU acceleration in the RapidOCRPDFLoader tool. The developers are aware of this need and are considering implementing it in the future. Unfortunately, without more specific details about the tool's implementation, it's hard to provide a concrete solution for optimizing its performance on a 48-core server.
  2. Configuring the chatGLM3-6b model to respond with "No relevant content found" when no matching content is found in the Milvus vector library's knowledge base:

    • This would typically involve checking the response from the Milvus vector library and then crafting a response if no relevant content is found. However, without more specific details about how the model interacts with the Milvus vector library and handles the responses, it's hard to provide a concrete solution.
  3. Removing the <回答></回答> (i.e. <answer></answer>) tags from the model's responses:

    • If these tags are part of the responses, you can use string manipulation or a regular expression to remove them. For example:
      import re

      def clean_response(response: str) -> str:
          # Strip Chinese <回答></回答> and English <answer></answer> wrapper tags
          return re.sub(r'</?(?:回答|answer)>', '', response)

      You would then call this function on the response text before returning it.
  4. Preventing the model from repeating the same content in its responses:

    • This could be achieved by modifying the StructuredGLM3ChatAgent class to include a history of responses and checking against this history before finalizing a response. However, without more specific details about how the model generates and stores its responses, it's hard to provide a concrete solution.
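Item 4's history check could be sketched like this (illustrative only; `drop_repeated_rounds` is a hypothetical helper, not part of StructuredGLM3ChatAgent), deduplicating identical question/answer rounds while preserving order:

```python
def drop_repeated_rounds(rounds):
    """Remove duplicate Q/A rounds from a generated transcript.

    `rounds` is a list of (question, answer) tuples; only the first
    occurrence of each identical round is kept, in original order.
    """
    seen = set()
    unique = []
    for r in rounds:
        if r not in seen:  # keep a history and skip exact repeats
            seen.add(r)
            unique.append(r)
    return unique
```

A post-processing filter like this only masks the symptom; repeated Rounds in the raw output usually point to a stop-token or chat-template mismatch between the model and the serving code, which is worth checking first.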

I hope this helps! If you have more specific details about the implementation of the RapidOCRPDFLoader tool, the interaction between the chatGLM3-6b model and the Milvus vector library, or the generation and storage of the model's responses, I would be able to provide more detailed solutions.
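For item 2 above, a minimal sketch of the short-circuit (the function `answer_with_fallback` and its parameters are hypothetical names, not Langchain-Chatchat's actual API): check the retrieved documents before calling the model and return the preset template when the list is empty.

```python
def answer_with_fallback(query, docs, generate_answer,
                         fallback="暂未找到相关内容"):
    # `docs` is the list returned by the vector-store search;
    # `generate_answer` is whatever callable invokes the LLM.
    if not docs:  # nothing matched in the knowledge base
        return fallback
    return generate_answer(query, docs)
```

In practice a similarity-score threshold is usually needed as well, because vector stores such as Milvus return the top-k nearest neighbors even for unrelated queries, so the retrieved list is rarely empty on its own; filtering hits below the threshold before the emptiness check makes the fallback actually trigger.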


dosubot[bot] avatar Mar 14 '24 02:03 dosubot[bot]

Just install the GPU version.

zRzRzRzRzRzRzR avatar May 07 '24 02:05 zRzRzRzRzRzRzR
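A hedged sketch of what "install the GPU version" might look like (the package names are assumptions and depend on your Langchain-Chatchat version and OCR backend; check the project's requirements files):

```shell
# Assumption: the Paddle-based RapidOCR package, which exposes the
# det_use_cuda / cls_use_cuda / rec_use_cuda switches, built against CUDA
pip install rapidocr_paddle

# Or, if the ONNX backend is used, swap in the GPU build of onnxruntime
pip uninstall -y onnxruntime
pip install onnxruntime-gpu
```

Either way, a working CUDA toolkit and matching driver must already be present on the machine.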