Same document extracted in different projects gives inconsistent QA answers

Open jerryHo123 opened this issue 1 year ago • 5 comments

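In project A, I extracted document 1, document 2, and document 3. When I then asked about a person described in document 3, the reply was that no relevant knowledge was found. In project B, I extracted only document 3. When I asked about the same person information from document 3, the reply was content extracted from document 3.

Which stage is going wrong here?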

jerryHo123 avatar Dec 05 '24 09:12 jerryHo123

Could you provide the documents and the question for project A?

thundax-lyp avatar Dec 05 '24 09:12 thundax-lyp

new.txt.txt The question is: Do Yang Bin and Peng Huagang know each other? Have they participated in activities together?

jerryHo123 avatar Dec 05 '24 09:12 jerryHo123

Could you tell us how to reproduce your experiment, including your LLM conf, embedding model conf, schema, test set, etc.?

caszkgui avatar Dec 06 '24 01:12 caszkgui

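I created a test project by following the "Quick Start for a New Scenario" steps in the documentation: https://openspg.yuque.com/ndx6g9/0.5/vbbdp80vg0xf5n3k
The image below shows the LLM conf: 0c5d70792b99e6da5f6a364c12ffc51
The image below shows the schema: d33cb169efbf258af5d9c9b66fc4e01
The image below shows build/indexer.py: image
Project B's dataset is just the new.txt posted above. In project A, two other documents were extracted before new.txt (any two documents will do). Because of network restrictions at my workplace I cannot copy and paste, so I can only share photos. Thank you for your understanding.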

jerryHo123 avatar Dec 06 '24 02:12 jerryHo123

I cannot reproduce your problem. KAG 0.7 has been released and fixes some bugs in retrieval. Could you try again and tell us whether your problem is fixed?

caszkgui avatar Apr 18 '25 06:04 caszkgui

KAG 0.8 was released on 2025-06-27. In this version, we have improved the management of private domain knowledge base indexing, incorporating multiple foundational index types such as Outline, Summary, KnowledgeUnit, AtomicQuery, Chunk, and Table. This supports developers in customizing indexes and synchronizing them with product interfaces. Users can select the most appropriate index type for their specific scenario, balancing construction cost against business outcomes.

caszkgui avatar Aug 15 '25 23:08 caszkgui