graphiti
Data insertion
Hello,
I am playing with the runner notebook and I just found that these lines take quite a long time:
async def ingest_products_data(client: Graphiti):
    script_dir = Path.cwd().parent
    json_file_path = script_dir / 'data' / 'manybirds_products.json'

    with open(json_file_path) as file:
        products = json.load(file)['products']

    episodes: list[RawEpisode] = [
        RawEpisode(
            name=product.get('title', f'Product {i}'),
            content=str({k: v for k, v in product.items() if k != 'images'}),
            source_description='ManyBirds products',
            source=EpisodeType.json,
            reference_time=datetime.now(),
        )
        for i, product in enumerate(products)
    ]

    await client.add_episode_bulk(episodes)
Also, after some time it gets stuck:
I understand that under the hood it is embedding each category of each product, right? Could you share more about what this is doing, please?
Thank you!
Yeah! We found that large JSON files do end up taking a long time, but ingestion should still complete within a few minutes. Under the hood we use LLMs to extract entities and relations, and then deduplicate those against existing entities and relations. We also attempt to extract the time at which each relation holds and whether new relations invalidate old ones.
I go a bit more in depth on some of our LLM calls in this blog post: https://blog.getzep.com/llm-rag-knowledge-graphs-faster-and-more-dynamic/
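To make the deduplicate-and-invalidate idea above a bit more concrete, here is a toy sketch in plain Python. It is not Graphiti's implementation: there is no LLM call, the `Fact` class and the `normalize` and `add_fact` helpers are hypothetical names invented for illustration, and it assumes that each (subject, predicate) pair holds a single value at a time and that facts arrive in chronological order.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Fact:
    """Hypothetical stand-in for an extracted relation with a validity window."""
    subject: str
    predicate: str
    obj: str
    valid_at: datetime
    invalid_at: Optional[datetime] = None


def normalize(name: str) -> str:
    """Crude name matching: lowercase and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def add_fact(graph: list[Fact], new: Fact) -> None:
    """Add a fact, skipping duplicates and invalidating a contradicted one.

    Assumption for this sketch: a (subject, predicate) pair has at most
    one active value, so a new value closes the old fact's validity window.
    """
    new_key = (normalize(new.subject), new.predicate)
    for old in graph:
        if old.invalid_at is not None:
            continue  # already invalidated, nothing to compare against
        if (normalize(old.subject), old.predicate) != new_key:
            continue  # unrelated fact
        if normalize(old.obj) == normalize(new.obj):
            return  # same fact already active: deduplicate by dropping it
        old.invalid_at = new.valid_at  # the new fact supersedes the old one
    graph.append(new)
```

In the real pipeline each of these decisions (is this the same entity, does this relation contradict that one, when did it become true) is made by LLM calls rather than string comparison, which is a large part of why bulk ingestion of big JSON files is slow.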
Thank you for the clarification!
I also received this error when running the "agent" example notebook:
# Test the tool node
await tool_node.ainvoke({'messages': [await llm.ainvoke('What are the different types of shoes')]})
Make sure you are on at least version 5.21 of Neo4j, as that is our minimum supported version.
Some history: Neo4j used to have an alternative syntax for determining shortest paths between nodes, but the ISO released a GQL standard for graphs in April 2024. Neo4j added the SHORTEST keyword as they work towards being GQL compliant.
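If it helps, here is a small sketch for checking a server version string against the 5.21 minimum, plus the two query styles side by side. The helper names, the `Entity` label, and the `RELATES_TO` relationship type are made up for illustration and are not taken from Graphiti's code. The version string itself can be read from the server with `CALL dbms.components() YIELD name, versions`.

```python
import re

# Minimum Neo4j version mentioned above, as a (major, minor) tuple.
MIN_NEO4J_VERSION = (5, 21)


def parse_version(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a string such as '5.21.0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version: str, minimum: tuple[int, int] = MIN_NEO4J_VERSION) -> bool:
    """Compare numerically, so that 5.9 sorts below 5.21 (unlike string comparison)."""
    return parse_version(version) >= minimum


# Older Cypher style: the shortestPath() function.
LEGACY_QUERY = """
MATCH p = shortestPath((a:Entity {name: $src})-[:RELATES_TO*]-(b:Entity {name: $dst}))
RETURN p
"""

# GQL-aligned style using the SHORTEST keyword; older servers reject this syntax.
GQL_QUERY = """
MATCH p = SHORTEST 1 (a:Entity {name: $src})-[:RELATES_TO]-+(b:Entity {name: $dst})
RETURN p
"""
```

A server that fails `meets_minimum` will raise a syntax error on queries written in the second style, which is the kind of error an outdated installation produces.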