[Question]: Graph storage performance - Postgres AGE is running slow
Do you need to ask a question?
- [x] I have searched the existing questions and discussions and this question is not already answered.
- [x] I believe this is a legitimate question, not just a bug or feature request.
Your Question
I have been experimenting with the graph storage, using Azure Postgres Flexible Server with the AGE and VECTOR extensions to cover all of LightRAG's storage needs. I created the indexes described in the LightRAG documentation, but retrieval is extremely slow: even with a top k of 10, on a graph of 3500 nodes and 4500 edges, a response takes 3-5 minutes. Building the knowledge graph is not an issue.
With support for Mongo as a single solution for all storages being dropped, the Postgres graph performance needs to improve to increase adoption of LightRAG as a solution.
Additional Context
Here is the longest-running Postgres query reported by Azure monitoring:
SELECT query_sql_text FROM query_store.query_texts_view WHERE query_text_id=4153752584585438422;

query_sql_text:
SELECT * FROM cypher('chunk_entity_relation', $$
    MATCH (n:base)
    OPTIONAL MATCH (n)-[r]->(target:base)
    RETURN collect(distinct n) AS n, collect(distinct r) AS r
    LIMIT 1000
$$) AS (n agtype, r agtype)
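For anyone who wants to reproduce the timing outside of LightRAG, here is a minimal sketch that wraps the Cypher text from the query above in AGE's `cypher()` SQL form and prefixes it with `EXPLAIN (ANALYZE, BUFFERS)`. The helper names (`wrap_cypher`, `explain_sql`, `time_query`) are hypothetical and are not part of LightRAG. The connection is assumed to be any DB-API style connection (for example from psycopg2), and the `search_path` setup is an assumption that may differ on your server.

```python
import time

# Cypher text copied from the slow query reported above.
SLOW_CYPHER = """
MATCH (n:base)
OPTIONAL MATCH (n)-[r]->(target:base)
RETURN collect(distinct n) AS n, collect(distinct r) AS r
LIMIT 1000
"""


def wrap_cypher(graph: str, cypher: str, columns: list[str]) -> str:
    """Wrap a Cypher query in the SQL form used by Apache AGE's cypher() function."""
    col_defs = ", ".join(f"{name} agtype" for name in columns)
    return f"SELECT * FROM cypher('{graph}', $$ {cypher.strip()} $$) AS ({col_defs})"


def explain_sql(sql: str) -> str:
    """Prefix a statement so PostgreSQL executes it and reports the actual plan."""
    return f"EXPLAIN (ANALYZE, BUFFERS) {sql}"


def time_query(conn, sql: str):
    """Run sql on a DB-API connection (assumed, e.g. psycopg2) and time it.

    Returns (elapsed_seconds, rows).
    """
    with conn.cursor() as cur:
        # Assumption: AGE functions live in ag_catalog; some setups also
        # need LOAD 'age' to be run first in the session.
        cur.execute('SET search_path = ag_catalog, "$user", public')
        start = time.perf_counter()
        cur.execute(sql)
        rows = cur.fetchall()
    return time.perf_counter() - start, rows


sql = wrap_cypher("chunk_entity_relation", SLOW_CYPHER, ["n", "r"])
plan_sql = explain_sql(sql)
```

Passing `plan_sql` to `time_query` returns the plan rows along with the wall-clock time. One thing that may be worth checking in the plan: because both return columns are `collect(...)` aggregates, the query produces a single row, so `LIMIT 1000` is applied after aggregation and does not bound how many nodes and edges are scanned.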
PostgreSQL graph performance issue is on the priority waiting list.
Thanks. I see the new branch created from the other issue.
Hello, is there any progress on this? It is eagerly awaited. If someone can help me with this, it would be much appreciated.
Hi, any updates on this? With PGVector + AGE, queries ran easily within 3-4 seconds when I had loaded only 2 documents. However, when the number of documents grew to about 30, CPU usage hit 100% and I got no response. The KG size is about 10k nodes and 20k edges.
PostgreSQL AGE performance issues have been resolved since v1.3.3. You can pull the latest version from the main branch and give it a try.
For large-scale deployments, however, we recommend using Neo4j.
The PostgreSQL storage driver has improved a lot. Please verify whether the issue is resolved with the latest version.