[Question]: Graph storage performance - Postgres AGE is running slow

Open acsangamnerkar opened this issue 8 months ago • 2 comments

Do you need to ask a question?

  • [x] I have searched the existing question and discussions and this question is not already answered.
  • [x] I believe this is a legitimate question, not just a bug or feature request.

Your Question

I have been experimenting with graph storage, using Azure Postgres Flexible Server with the AGE and VECTOR extensions to cover all of LightRAG's storage needs. I created the indexes described in the LightRAG documentation, but retrieval is extremely slow: even with a top k of 10, on a graph of 3,500 nodes and 4,500 edges, a response takes 3-5 minutes. Building the knowledge graph is not a problem.

With Mongo support being dropped as a single solution for all storage types, Postgres graph performance needs to improve in order to increase adoption of LightRAG.

Additional Context

Here is the query that Azure monitoring reports as the longest-running Postgres query:

SELECT query_sql_text FROM query_store.query_texts_view WHERE query_text_id=4153752584585438422;

query_sql_text

SELECT * FROM cypher('chunk_entity_relation', $$
    MATCH (n:base)
    OPTIONAL MATCH (n)-[r]->(target:base)
    RETURN collect(distinct n) AS n, collect(distinct r) AS r
    LIMIT 1000
$$) AS (n agtype, r agtype)
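
One thing that may be worth checking in the logged query: its RETURN clause contains only aggregates, so it produces a single row, and the trailing LIMIT 1000 then caps the number of result rows rather than the number of nodes scanned. Below is a minimal Python sketch of a variant that bounds the node set before the OPTIONAL MATCH. The helper name `build_bounded_subgraph_query` and its parameters are hypothetical, this is not LightRAG's actual code, and whether it helps on a given AGE version is an assumption to verify with EXPLAIN ANALYZE.

```python
import re

# Hypothetical helper, not part of LightRAG: builds a variant of the logged
# query that limits the node set *before* expanding edges.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_bounded_subgraph_query(graph_name="chunk_entity_relation",
                                 label="base",
                                 max_nodes=1000):
    """Return SQL text wrapping a Cypher query with an early node limit."""
    # Values are interpolated into the query text, so validate them strictly.
    for name in (graph_name, label):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    if not isinstance(max_nodes, int) or max_nodes <= 0:
        raise ValueError("max_nodes must be a positive integer")

    cypher = (
        f"MATCH (n:{label}) "
        f"WITH n LIMIT {max_nodes} "  # bound the scan before the OPTIONAL MATCH
        f"OPTIONAL MATCH (n)-[r]->(target:{label}) "
        "RETURN collect(distinct n) AS n, collect(distinct r) AS r"
    )
    return (
        f"SELECT * FROM cypher('{graph_name}', $$ {cypher} $$) "
        "AS (n agtype, r agtype)"
    )


if __name__ == "__main__":
    print(build_bounded_subgraph_query(max_nodes=100))
```

The generated text can be pasted into psql after EXPLAIN ANALYZE to compare its plan and timing against the original query.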

acsangamnerkar avatar Apr 05 '25 21:04 acsangamnerkar

PostgreSQL graph performance issue is on the priority waiting list.

danielaskdd avatar Apr 08 '25 16:04 danielaskdd

Thanks. I see the new branch created from the other issue.

acsangamnerkar avatar Apr 08 '25 23:04 acsangamnerkar

Hello, is there any progress on this? It is eagerly awaited. If someone can help me with this, it would be much appreciated.

hamzafj5 avatar Apr 21 '25 08:04 hamzafj5

Hi, any updates on this? With PGVector + AGE, queries ran comfortably within 3-4 seconds when I had loaded only 2 documents. However, once the number of documents grew to about 30, CPU usage hit 100% and I got no response. The KG has about 10k nodes and 20k edges.

khizarhussain19 avatar Apr 21 '25 08:04 khizarhussain19

PostgreSQL AGE performance issues have been resolved since v1.3.3. You can pull the latest version from the main branch and give it a try.
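
To confirm that an environment is actually running v1.3.3 or later before re-testing, a small check along these lines may help. This is a sketch only: the distribution name `lightrag-hku` is an assumption about how the package was installed, and the helper names are hypothetical.

```python
import re
from importlib import metadata


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break  # stop at the first non-numeric component, e.g. "post2"
        parts.append(int(match.group()))
    return tuple(parts)


def is_at_least(installed, minimum="1.3.3"):
    """True if `installed` is numerically >= `minimum` (suffixes are ignored)."""
    return parse_version(installed) >= parse_version(minimum)


def installed_lightrag_version(dist_name="lightrag-hku"):
    """Look up the installed version; `dist_name` is an assumed package name."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_lightrag_version()
    if version is None:
        print("LightRAG distribution not found under the assumed name")
    else:
        print(version, "OK" if is_at_least(version) else "older than 1.3.3")
```

The comparison ignores pre-release and post-release suffixes, so it is only a rough guard.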

danielaskdd avatar Apr 21 '25 12:04 danielaskdd

For large-scale deployments, however, we recommend using Neo4j.

danielaskdd avatar Apr 21 '25 12:04 danielaskdd

The PostgreSQL storage driver has improved a lot. Please verify whether the issue is resolved with the latest version.

danielaskdd avatar Jul 20 '25 02:07 danielaskdd