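Extracting the same document in different projects gives inconsistent QA answers
In project A, I extracted document 1, document 2, and document 3. When I then asked about a person mentioned in document 3, the reply was that no relevant knowledge was found. In project B, I extracted only document 3. When I asked the same question, the reply contained content taken from document 3.
Which stage of the pipeline is causing this?
Could you provide the documents and the question used in project A?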
new.txt.txt The question is: Do Yang Bin and Peng Huagang know each other? Have they participated in activities together?
Could you tell us how to reproduce your experiments, including your LLM conf, embedding model conf, schema, test set, etc.?
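I created a test project by following the quick-start steps for a new scenario in the documentation: https://openspg.yuque.com/ndx6g9/0.5/vbbdp80vg0xf5n3k
The image below shows the LLM conf.
The image below shows the schema.
The image below shows build/indexer.py.
The dataset for project B is just the new.txt posted above. In project A, two other documents were extracted before new.txt (any two documents will do).
Because of network restrictions at my workplace, I cannot copy the text and can only share it as photos. Thank you for your understanding.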
I cannot reproduce your problem. KAG 0.7 has been released and fixes some bugs in retrieval. Could you try again and tell us whether your problem is fixed?
KAG 0.8 was released on 2025-06-27. In this version, we have improved the management of private domain knowledge base indexing, incorporating multiple foundational index types such as Outline, Summary, KnowledgeUnit, AtomicQuery, Chunk, and Table. This lets developers customize indexes and synchronize them with the product interface. Users can select the most appropriate index type for their scenario, balancing construction cost against business outcomes.