graphiti Data insertion

Hello,

I am playing with the runner notebook and I just found that these lines take quite a long time:

async def ingest_products_data(client: Graphiti):
    script_dir = Path.cwd().parent
    json_file_path = script_dir / 'data' / 'manybirds_products.json'

    with open(json_file_path) as file:
        products = json.load(file)['products']

    episodes: list[RawEpisode] = [
        RawEpisode(
            name=product.get('title', f'Product {i}'),
            content=str({k: v for k, v in product.items() if k != 'images'}),
            source_description='ManyBirds products',
            source=EpisodeType.json,
            reference_time=datetime.now(),
        )
        for i, product in enumerate(products)
    ]

    await client.add_episode_bulk(episodes)

Also, after some time it gets stuck: [screenshot, 2024-09-19 16:34:08]

I understand that under the hood it is embedding each category of each product, right? Can you share more about what this is doing, please?

Thank you!

Sep 19 '24 14:09 alejandrods

Yeah! We found that large JSON files do end up taking a long time, but everything should still complete within a few minutes. Under the hood we use LLMs to extract entities and relations, and then deduplicate those against existing entities and relations. We also attempt to extract the time at which these relations hold, and whether or not new relations invalidate old ones.

I go a bit more in depth on some of our LLM calls in this blog post: https://blog.getzep.com/llm-rag-knowledge-graphs-faster-and-more-dynamic/
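
Since each episode goes through several LLM calls, one way to tell a slow run from a stuck one is to ingest in smaller batches and print progress after each. This is only a minimal sketch: chunked, ingest_in_batches and batch_size are hypothetical names, not part of graphiti, and how batch size affects cross-batch deduplication has not been verified here.

```python
import asyncio
from typing import Any, Awaitable, Callable, Sequence


def chunked(items: Sequence[Any], size: int) -> list[list[Any]]:
    """Split items into consecutive lists of at most `size` elements."""
    if size < 1:
        raise ValueError('size must be at least 1')
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


async def ingest_in_batches(
    add_bulk: Callable[[list[Any]], Awaitable[Any]],
    episodes: Sequence[Any],
    batch_size: int = 10,  # hypothetical default, tune for your data
) -> int:
    """Await add_bulk once per batch, reporting progress after each batch.

    add_bulk is any async callable that accepts a list of episodes;
    the assumption is that client.add_episode_bulk can be passed here.
    """
    batches = chunked(episodes, batch_size)
    done = 0
    for n, batch in enumerate(batches, start=1):
        await add_bulk(batch)
        done += len(batch)
        print(f'batch {n}/{len(batches)} done ({done}/{len(episodes)} episodes)')
    return done
```

In the notebook this would replace the single bulk call with something like: await ingest_in_batches(client.add_episode_bulk, episodes, batch_size=10). If the output stops advancing for a long time, the last printed batch number shows roughly where it stalled.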

Sep 19 '24 16:09 prasmussen15

Thank you for the clarification!

I also received an error when running the "agent" example notebook:

# Test the tool node
await tool_node.ainvoke({'messages': [await llm.ainvoke('What are the different types of shoes')]})

Sep 20 '24 08:09 alejandrods

Make sure you are on at least version 5.21 of Neo4j, as that is our minimum supported version.

Some history: Neo4j used to have an alternative syntax for determining shortest paths between nodes, but the ISO released a GQL standard for graph query languages in April 2024. Neo4j added the SHORTEST keyword as they work towards being GQL compliant.
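
To check which version a running instance reports, something like the sketch below may help. The helper names (parse_major_minor, meets_minimum, fetch_neo4j_version) are made up for illustration, and the connection details are whatever your own setup uses. It assumes a recent 5.x neo4j Python driver that provides execute_query, and it reads the version from the dbms.components() procedure.

```python
import re


def parse_major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a version string such as '5.21.0'."""
    match = re.match(r'(\d+)\.(\d+)', version)
    if match is None:
        raise ValueError(f'unrecognized version string: {version!r}')
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version: str, minimum: tuple[int, int] = (5, 21)) -> bool:
    """True if the version is at least the minimum (5.21 per the reply above)."""
    return parse_major_minor(version) >= minimum


def fetch_neo4j_version(uri: str, user: str, password: str) -> str:
    """Ask the server for its version string (needs a reachable instance)."""
    # Imported lazily so the helpers above work without the driver installed.
    from neo4j import GraphDatabase

    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        records, _, _ = driver.execute_query(
            'CALL dbms.components() YIELD name, versions RETURN name, versions'
        )
    return records[0]['versions'][0]
```

For example, meets_minimum(fetch_neo4j_version(uri, user, password)) returning False would suggest the instance predates the SHORTEST keyword.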

Sep 20 '24 13:09 prasmussen15
