Question: Any way to attach `PipelineResult.run_id` to the nodes and relationships that were touched by the SimpleKGPipeline run?

Open nickjfrench opened this issue 10 months ago • 3 comments

I want to be able to MATCH which nodes and relationships were added or edited by a run. I know SimpleKGPipeline returns a PipelineResult that contains the number of nodes and the run_id, but the run_id doesn't seem to be attached to any nodes or relationships within the graph DB. There is an id property added, but that seems to come from the chunk's id.

Am I missing something that already exists either within the GraphRAG package or Neo4j GraphDB, or is there an easy way to attach it to nodes and rels myself? If I attach it myself, I imagine the property would need to be an array, as nodes will be edited by multiple runs.

nickjfrench avatar Feb 27 '25 16:02 nickjfrench

Hi @nickjfrench,

This is not possible at the moment: the internal pipeline components, especially the KGWriter, do not know about this run_id (which is used to store and access component results during pipeline execution).

I understand the use case though, so I'll keep this issue open until we can provide a real solution.

stellasia avatar Feb 27 '25 17:02 stellasia

Just wanted to chime in: this is crucial to my use case as well. I'm working on enriching an existing KG, and I need to be able to identify which nodes and relationships were touched by a specific SimpleKGPipeline run (via a run_id or similar).

Looking forward to any updates on this.

oskrocha avatar Mar 10 '25 13:03 oskrocha

Hi,

New in release 1.6.1: it is now possible to access the run_id from within a component. To do so, you must implement the run_with_context method instead of run (note that these two methods will eventually be merged when the API is stabilized). The run_id is attached to the RunContext that is passed as the first argument to this method. Here is an example:

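from typing import Any

# Import paths assumed from the neo4j-graphrag 1.6.x layout; adjust if they differ.
from neo4j_graphrag.experimental.pipeline import Component, DataModel
from neo4j_graphrag.experimental.pipeline.types.context import RunContext

class ComponentResult(DataModel):
    # Illustrative result model that simply carries the run_id.
    run_id: str

class MyComponent(Component):
    async def run_with_context(
        self,
        context_: RunContext,
        numbers: list[int],
        **kwargs: Any,
    ) -> ComponentResult:
        # The run_id of the current pipeline run is exposed on the context.
        run_id = context_.run_id
        return ComponentResult(run_id=run_id)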

In order to attach it to all created nodes, at the moment you still need to create your own extractor. We're discussing this point internally, I'll come back to you as soon as possible.
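As a rough illustration of what such a custom extractor could do once it has the run_id, here is a minimal sketch of the stamping step on its own. The `Node` and `Relationship` dataclasses are hypothetical stand-ins for whatever graph objects your extractor produces, and `stamp_run_id` and the `run_ids` property name are made up for this example; none of them are part of the library.

```python
from dataclasses import dataclass, field
from typing import Any, Iterable, Union


@dataclass
class Node:
    # Hypothetical stand-in for an extracted node.
    id: str
    label: str
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    # Hypothetical stand-in for an extracted relationship.
    start_node_id: str
    end_node_id: str
    type: str
    properties: dict[str, Any] = field(default_factory=dict)


def stamp_run_id(
    items: Iterable[Union[Node, Relationship]],
    run_id: str,
    key: str = "run_ids",
) -> None:
    """Append run_id to a list-valued property on each item, without duplicates."""
    for item in items:
        run_ids = list(item.properties.get(key, []))
        if run_id not in run_ids:
            run_ids.append(run_id)
        item.properties[key] = run_ids


nodes = [
    Node(id="a", label="Person"),
    Node(id="b", label="Org", properties={"run_ids": ["run-0"]}),
]
rels = [Relationship(start_node_id="a", end_node_id="b", type="WORKS_AT")]

# Inside run_with_context, run_id would come from context_.run_id.
stamp_run_id(nodes, "run-1")
stamp_run_id(rels, "run-1")
```

Inside a real component this would be called on the extracted graph before it is handed to the writer. Whether values from earlier runs survive in the database depends on how the writer merges properties, which this sketch does not address.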

stellasia avatar Mar 31 '25 07:03 stellasia

After internal discussion, it was decided not to implement this solution for now. As discussed above, you can create your own component to make it happen if this is something important to you.
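For anyone going the do-it-yourself route, one possible shape for the database side is sketched below. It only builds Cypher strings and parameters, so it runs without a database. It assumes the touched nodes can be found by their `id` property and that run IDs are kept in an array property called `run_ids`; both function names and the property name are hypothetical.

```python
from typing import Any


def build_tag_query(
    run_id: str, node_ids: list[str], prop: str = "run_ids"
) -> tuple[str, dict[str, Any]]:
    """Build a Cypher query that appends run_id to an array property."""
    # The property name is interpolated into the query, so restrict it
    # to a plain identifier; the values themselves go through parameters.
    if not prop.isidentifier():
        raise ValueError(f"invalid property name: {prop!r}")
    query = (
        "MATCH (n) WHERE n.id IN $ids "
        f"SET n.{prop} = CASE "
        f"WHEN $run_id IN coalesce(n.{prop}, []) THEN n.{prop} "
        f"ELSE coalesce(n.{prop}, []) + $run_id END"
    )
    return query, {"ids": node_ids, "run_id": run_id}


def build_match_query(
    run_id: str, prop: str = "run_ids"
) -> tuple[str, dict[str, Any]]:
    """Build a Cypher query that returns the nodes tagged with run_id."""
    if not prop.isidentifier():
        raise ValueError(f"invalid property name: {prop!r}")
    return f"MATCH (n) WHERE $run_id IN n.{prop} RETURN n", {"run_id": run_id}


tag_query, tag_params = build_tag_query("run-1", ["a", "b"])
match_query, match_params = build_match_query("run-1")
```

The resulting query and parameters can then be passed to the Neo4j Python driver, for example via `driver.execute_query(tag_query, tag_params)`. Relationships could be handled the same way with a `MATCH ()-[r]->()` pattern. The `coalesce` plus `CASE` keeps the array free of duplicates when the same run tags a node twice.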

stellasia avatar Oct 28 '25 14:10 stellasia