Serverless-Retrieval-Augmented-Generation-RAG-on-AWS

Feature: Introducing Kùzu graph database for extra-planar relationships

Open giusedroid opened this issue 1 year ago • 2 comments

Kùzu is an embedded graph database. We want to explore graph-RAG capabilities so that semantic retrieval can pull in more relevant information.

A good MVP for this would be mapping, at ingestion, the obvious relationships that we cannot store semantically as vectors, for example:

page -> next() : page
page -> previous() : page
page -> belongsToDocument() : document
page -> sectionStartsAt() : page
page -> sectionEndsAt() : page
document -> relatesToDocument() : document[]
document -> belongsToCollection() : document[] 
document -> abstract() : string

Then, at semantic retrieval, use the relationships mapped for the retrieved vectors to provide additional context, or to exclude other vectors from context placement if they are not related to the most relevant collection. This would probably happen as part of re-ranking.
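To make the idea concrete, here is a minimal sketch of mapping page relationships at ingestion and using them to widen the context around retrieved pages. It is only an illustration: `PageGraph`, `ingest_document`, and `expand` are hypothetical names, and the plain dictionaries stand in for the embedded graph database rather than using its real API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PageGraph:
    """In-memory stand-in for the graph store (hypothetical, not a real database API)."""

    next_page: dict[str, str] = field(default_factory=dict)      # page -> next() : page
    previous_page: dict[str, str] = field(default_factory=dict)  # page -> previous() : page
    document_of: dict[str, str] = field(default_factory=dict)    # page -> belongsToDocument()

    def ingest_document(self, document_id: str, page_ids: list[str]) -> None:
        """Record the edges for one document whose pages are given in reading order."""
        for index, page_id in enumerate(page_ids):
            self.document_of[page_id] = document_id
            if index > 0:
                self.previous_page[page_id] = page_ids[index - 1]
            if index < len(page_ids) - 1:
                self.next_page[page_id] = page_ids[index + 1]

    def expand(self, retrieved_page_ids: list[str]) -> list[str]:
        """Return each retrieved page with its previous and next page, without duplicates."""
        expanded: list[str] = []
        for page_id in retrieved_page_ids:
            neighbours = (
                self.previous_page.get(page_id),
                page_id,
                self.next_page.get(page_id),
            )
            for candidate in neighbours:
                if candidate is not None and candidate not in expanded:
                    expanded.append(candidate)
        return expanded


graph = PageGraph()
graph.ingest_document("doc-a", ["a1", "a2", "a3"])
graph.ingest_document("doc-b", ["b1", "b2"])

# Pretend the vector search returned these two pages.
expanded_pages = graph.expand(["a2", "b1"])
```

In a real implementation the same edges would presumably live in graph tables and be fetched with a query per retrieved vector; the dictionaries here only show the shape of the data.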

giusedroid, Jun 15 '24 11:06

As a business user, I want to make sure that I am only sourcing information from a collection of documents, so that the risk of hallucination from mixing unrelated documents is minimized.

For example, with purely semantic (vector) retrieval we've had hallucinations caused by confusing chunks that belong to different documents. A notable result is the LLM combining knowledge about two different business entities, e.g. "Amazon's new CEO is Barack Obama", because it loaded into context two documents that both referred to business policies.
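One possible way to avoid this kind of mixing is to keep only the chunks from the collection that dominates the retrieved set. The sketch below is an assumption, not something decided in this issue: `keep_dominant_collection` is a hypothetical helper, the chunk and collection ids are made up, and summing similarity scores per collection is just one simple heuristic for picking "the most relevant collection".

```python
from collections import defaultdict


def keep_dominant_collection(hits, collection_of):
    """Keep only the hits that belong to the highest-scoring collection.

    hits: list of (chunk_id, similarity_score) pairs from the vector search.
    collection_of: mapping chunk_id -> collection id (would come from the graph).
    Returns (collection_id, kept_hits), with kept_hits sorted by descending score.
    """
    if not hits:
        return None, []

    # Sum the similarity scores per collection (assumed heuristic).
    totals = defaultdict(float)
    for chunk_id, score in hits:
        totals[collection_of[chunk_id]] += score

    best = max(totals, key=totals.get)
    kept = [(chunk_id, score) for chunk_id, score in hits if collection_of[chunk_id] == best]
    return best, sorted(kept, key=lambda hit: hit[1], reverse=True)


# Made-up example: two policy documents from different business entities.
hits = [("policy-a-p3", 0.82), ("policy-b-p1", 0.80), ("policy-a-p4", 0.75)]
collection_of = {
    "policy-a-p3": "entity-a",
    "policy-a-p4": "entity-a",
    "policy-b-p1": "entity-b",
}

best_collection, context_chunks = keep_dominant_collection(hits, collection_of)
```

Here the chunk from the second entity is dropped even though its individual score is high, so the context passed to the LLM comes from a single collection.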

giusedroid, Sep 30 '24 14:09

RAG (Retrieval-Augmented Generation) system - expanded

luke-b, Oct 01 '24 17:10