[Question]: LlamaIndex multimodal PropertyGraphIndex with Neo4j

Open GobinFan opened this issue 10 months ago • 1 comment

Question Validation

[x] I have searched both the documentation and discord for an answer.

Question

How can I implement multimodal RAG based on PropertyGraphIndex combined with Neo4j, storing both image vectors and text vectors, with graph retrieval plus vector retrieval? Thanks, @jerryjliu

Feb 20 '25 10:02 GobinFan

Hey @GobinFan! I'm here to help you with any questions or issues you have. If you need assistance, feel free to ask!

To implement a multimodal RAG (Retrieval-Augmented Generation) system using a property graph index combined with Neo4j for storing image vectors and text vectors, you can follow these steps:

Initialize Neo4j Graph Store: Connect to your Neo4j database using Neo4jGraphStore.

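```
from llama_index.graph_stores.neo4j import Neo4jGraphStore

# username, password, url, and database are your Neo4j connection settings.
neo4j_graph_store = Neo4jGraphStore(
    username=username,
    password=password,
    url=url,
    database=database,
)
```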
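The connection settings above are left undefined. One way to supply them, sketched here on the assumption that they are kept in environment variables, is a small loader like the one below. The function name, the variable names (`NEO4J_USERNAME` and so on), and the defaults are all hypothetical; nothing in the library requires them.

```python
import os


def load_neo4j_config(env=None):
    """Collect Neo4j connection settings from environment variables.

    The variable names and default values are illustrative assumptions.
    """
    env = os.environ if env is None else env
    password = env.get("NEO4J_PASSWORD")
    if not password:
        # Fail early rather than connecting with an empty password.
        raise ValueError("NEO4J_PASSWORD is not set")
    return {
        "username": env.get("NEO4J_USERNAME", "neo4j"),
        "password": password,
        "url": env.get("NEO4J_URL", "bolt://localhost:7687"),
        "database": env.get("NEO4J_DATABASE", "neo4j"),
    }
```

The returned keys match the keyword arguments used above, so the dict could be unpacked into the constructor, for example `Neo4jGraphStore(**load_neo4j_config())`.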

Create Storage Context: Manage the storage of your graph data with StorageContext.

```
from llama_index.core import StorageContext

neo4j_storage_context = StorageContext.from_defaults(
    graph_store=neo4j_graph_store
)
```

Index Creation: Create a KnowledgeGraphIndex from your documents, including embeddings for vector retrieval.

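```
from llama_index.core import KnowledgeGraphIndex

neo4j_index = KnowledgeGraphIndex.from_documents(
    documents=docs,
    storage_context=neo4j_storage_context,
    max_triplets_per_chunk=10,  # cap on triplets extracted from each chunk
    include_embeddings=True,  # also embed the extracted triplets
)
```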

Node Parsing: Use a SentenceSplitter to parse nodes from documents.

```
from llama_index.core.node_parser import SentenceSplitter

node_parser = SentenceSplitter(chunk_size=512)
nodes = node_parser.get_nodes_from_documents(docs)
```
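To give a feel for what sentence-aware splitting does, here is a toy sketch that greedily packs whole sentences into chunks under a word budget. It is not how SentenceSplitter is implemented: real splitters count tokens rather than words and usually add overlap between chunks. The function name `split_into_chunks` and the `max_words` parameter are made up for this example.

```python
import re


def split_into_chunks(text, max_words=50):
    """Greedily pack whole sentences into chunks of at most max_words words.

    A sentence longer than max_words is kept whole in its own chunk.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        # Start a new chunk when this sentence would exceed the budget.
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Keeping sentences intact means each chunk stays readable on its own, which tends to matter when chunks are later embedded or handed to an LLM.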

Vector Index: Create a VectorStoreIndex for vector retrieval.
```
from llama_index.core import VectorStoreIndex

vector_index = VectorStoreIndex(nodes=nodes)
```
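Conceptually, vector retrieval ranks stored node embeddings by their similarity to the query embedding. The sketch below shows that idea with an in-memory dict and cosine similarity. It is only an illustration under simplifying assumptions, not how VectorStoreIndex works internally; the helper names and the hand-written vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, embeddings, k=2):
    """Return the k (node_id, score) pairs most similar to query_vec."""
    scored = [
        (node_id, cosine_similarity(query_vec, vec))
        for node_id, vec in embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hand-written 2-d vectors standing in for real embeddings.
example_embeddings = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
example_hits = top_k([1.0, 0.0], example_embeddings, k=2)
```

In a real index the vectors come from an embedding model and the search is delegated to the vector store.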

Query Engine Setup: To combine vector and graph retrieval, use the RAW_VECTOR_KG_COMBO branch, which wraps both retrievers in a single RetrieverQueryEngine.

```
from llama_index.core import get_response_synthesizer
from llama_index.core.retrievers import KGTableRetriever, VectorIndexRetriever

# Excerpt from a class method: Neo4jQueryEngineType, query_engine_type, and
# self are assumed to be defined by the surrounding code.
if query_engine_type == Neo4jQueryEngineType.RAW_VECTOR_KG_COMBO:
    from llama_index.core.query_engine import RetrieverQueryEngine

    neo4j_vector_retriever = VectorIndexRetriever(index=vector_index)
    neo4j_kg_retriever = KGTableRetriever(
        index=neo4j_index, retriever_mode="keyword", include_text=False
    )
    neo4j_custom_retriever = CustomRetriever(
        neo4j_vector_retriever, neo4j_kg_retriever
    )

    neo4j_response_synthesizer = get_response_synthesizer(
        response_mode="tree_summarize"
    )

    self.query_engine = RetrieverQueryEngine(
        retriever=neo4j_custom_retriever,
        response_synthesizer=neo4j_response_synthesizer,
    )
```

Custom Retriever: Implement the CustomRetriever used above, which merges the vector and graph results by node ID.

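```
from typing import List

from llama_index.core import QueryBundle
from llama_index.core.retrievers import (
    BaseRetriever,
    KGTableRetriever,
    VectorIndexRetriever,
)
from llama_index.core.schema import NodeWithScore


class CustomRetriever(BaseRetriever):
    """Runs a vector retriever and a KG retriever, then merges the results."""

    def __init__(
        self,
        vector_retriever: VectorIndexRetriever,
        kg_retriever: KGTableRetriever,
        mode: str = "OR",
    ) -> None:
        self._vector_retriever = vector_retriever
        self._kg_retriever = kg_retriever
        if mode not in ("AND", "OR"):
            raise ValueError("Invalid mode.")
        self._mode = mode
        # Let BaseRetriever set up its own state (e.g. the callback manager).
        super().__init__()

    def _retrieve(self, query_bundle: QueryBundle) -> List[NodeWithScore]:
        vector_nodes = self._vector_retriever.retrieve(query_bundle)
        kg_nodes = self._kg_retriever.retrieve(query_bundle)

        vector_ids = {n.node.node_id for n in vector_nodes}
        kg_ids = {n.node.node_id for n in kg_nodes}

        # On duplicate IDs the KG result overwrites the vector result.
        combined_dict = {n.node.node_id: n for n in vector_nodes}
        combined_dict.update({n.node.node_id: n for n in kg_nodes})

        # "AND" keeps nodes found by both retrievers, "OR" keeps nodes found by either.
        if self._mode == "AND":
            retrieve_ids = vector_ids.intersection(kg_ids)
        else:
            retrieve_ids = vector_ids.union(kg_ids)

        return [combined_dict[rid] for rid in retrieve_ids]
```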
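To see what the two modes return, the merge step can be reproduced with plain `(node_id, score)` pairs. This is a standalone sketch, not part of LlamaIndex; `merge_results` is a hypothetical helper. Unlike the class above, it sorts the output, because iterating over a set gives no guaranteed order.

```python
def merge_results(vector_hits, kg_hits, mode="OR"):
    """Combine two lists of (node_id, score) pairs.

    "AND" keeps ids found by both retrievers, "OR" keeps ids found by either.
    As in the class above, a KG hit replaces a vector hit with the same id.
    """
    if mode not in ("AND", "OR"):
        raise ValueError("Invalid mode.")
    combined = dict(vector_hits)
    combined.update(dict(kg_hits))
    vector_ids = {node_id for node_id, _ in vector_hits}
    kg_ids = {node_id for node_id, _ in kg_hits}
    keep = vector_ids & kg_ids if mode == "AND" else vector_ids | kg_ids
    # Sort by descending score, then id, so the result order is deterministic.
    return sorted(
        ((node_id, combined[node_id]) for node_id in keep),
        key=lambda pair: (-pair[1], pair[0]),
    )


vector_hits = [("n1", 0.9), ("n2", 0.5)]
kg_hits = [("n2", 0.7), ("n3", 0.4)]
either = merge_results(vector_hits, kg_hits, mode="OR")
both = merge_results(vector_hits, kg_hits, mode="AND")
```

Scores from the two retrievers are not necessarily on the same scale, so the sort here is only illustrative; a real pipeline might rerank the merged nodes instead.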

This setup supports both graph-based and vector-based retrieval against Neo4j ^[1]. Note that it indexes text only.
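The question asks about PropertyGraphIndex, while the steps above use KnowledgeGraphIndex. A rough sketch of the PropertyGraphIndex route is given below. It reflects my understanding of the `llama-index-core` and `llama-index-graph-stores-neo4j` packages, so check the class names and arguments against your installed version. The function name is hypothetical, and it assumes a reachable Neo4j instance plus an LLM and embedding model already configured through `llama_index.core.Settings`. Like the steps above, it handles text documents and does not address image embeddings.

```python
def build_property_graph_retriever(documents, username, password, url):
    """Build a PropertyGraphIndex backed by Neo4j and return a retriever.

    Assumptions: both llama-index packages are installed, Neo4j is
    reachable at `url`, and default models are set via Settings.
    """
    # Imported lazily so this module loads even without the packages installed.
    from llama_index.core import PropertyGraphIndex
    from llama_index.graph_stores.neo4j import Neo4jPropertyGraphStore

    graph_store = Neo4jPropertyGraphStore(
        username=username,
        password=password,
        url=url,
    )
    # Extracts graph paths from the documents and writes them to Neo4j.
    index = PropertyGraphIndex.from_documents(
        documents,
        property_graph_store=graph_store,
        show_progress=True,
    )
    # include_text=True returns source chunk text along with matched paths.
    return index.as_retriever(include_text=True)
```

As far as I know, the default retriever of a PropertyGraphIndex already combines graph-based lookup with vector similarity when the graph store supports vector queries, so a hand-written combining retriever may not be needed on this route.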


Feb 20 '25 10:02 dosubot[bot]