能否增加一个导入临时文档的功能?
传统的RAG方案除了支持知识库索引以外,应该还支持在线传入文档进行快速解析
我计划想要尝试开发这个功能,但我自己开发的功能可能有各种问题,如果开发者们近期有相同的计划,请告诉我,非常感谢qwq
我也有相同的开发计划,不过不是在线文档,是在进行一次deepResearch时候上传一些文档,目前的实现思路是在ds之前先将内容放到一个临时的rag里面,然后做deepSearch时候按一半ragSearch 一半webSearch进行ds。但是还有一些顾虑,放在内存做还是放在redis或者es、pg、mivlus什么数据库中做,向量模型选型也没做好。
I have the same development plan, but instead of online documents, I upload some documents during a deepResearch. The current implementation idea is to put the content into a temporary rag before the ds. Then when doing deepSearch, press half ragSearch and half webSearch to ds. But there are still some concerns, whether to do it in memory or redis or es, pg, mivlus database, vector model selection is not good.
It looks like you want to provide the RAG information by uploading the documents. As deer-flow support the RAG out of box, you can implement it by building a RAG with uploaded documents.
It looks like you want to provide the RAG information by uploading the documents. As deer-flow support the RAG out of box, you can implement it by building a RAG with uploaded documents.
Sorry, my earlier description may have caused confusion. This feature is technically unrelated to RAG.
What I'm hoping for is an immediate document processing function. The workflow would be:
- A user uploads a file (e.g., a PDF) from the frontend.
- The frontend/backend parses this file into plain text.
- The entire text content is directly placed into the LLM's context window for that conversation.
I'm not sure if this is a feature the core developers currently see value in, but it would be great to have!
There is a limitation of the context window, I'm not sure we can put the all the information of pdf into the context window. It could be much easier for us to just chain the related information for the deep research.
I think the solution that works right now is intra-session memory rag. The content length of the context is limited, so only some relevant content can be put into the context. The document when the user uploads is stored in the temporary knowledge base, and the recall is done during the search. And web search are incorporated into context. Can deerflow develop the capability to put in the pdf directly when starting the resaerch, and then search from the pdf as well when searching. willem, you are right, my current dilemma is the difficulty in selecting the technology for the temporary rag.