age icon indicating copy to clipboard operation
age copied to clipboard

AGE Cypher query matching variable length edges with [*] returns zero rows

Open ajit-satav opened this issue 1 year ago • 12 comments

While executing variable length query I am getting zero results

SELECT * FROM cypher('demo', $$ MATCH (a:Employee{name:'xxxx'})-[]-(b) RETURN a $$) as (a agtype); But if I do execute below query without variable length edges () it works and returns results

SELECT * FROM cypher('demo', $$ MATCH (a:Employee{name:'xxxx'})-[]-(b) RETURN a $$) as (a agtype);

Any suggestions why query with * is returning zero results?

ajit-satav avatar Nov 22 '23 12:11 ajit-satav

Hi @ajit-satav! Thank you very much for your question. Our team will look through it and provide an answer soon.

jimenasansav avatar Nov 22 '23 17:11 jimenasansav

@ajit-satav

Both with and without * produced same result on the latest master.

(pg-15.4) agedev=# SELECT *
FROM cypher('demo', $$
MATCH (a:Employee{name:'xxxx'})-[]-(b)
RETURN a
$$) as (a agtype);
                                           a                                           
---------------------------------------------------------------------------------------
 {"id": 1125899906842625, "label": "Employee", "properties": {"name": "xxxx"}}::vertex
(1 row)
(pg-15.4) agedev=# SELECT *
FROM cypher('demo', $$
MATCH (a:Employee{name:'xxxx'})-[*]-(b)
RETURN a
$$) as (a agtype);
                                           a                                           
---------------------------------------------------------------------------------------
 {"id": 1125899906842625, "label": "Employee", "properties": {"name": "xxxx"}}::vertex
(1 row)

I created the entities using-

SELECT * FROM cypher ('demo', $$ CREATE (a:Employee{name:'xxxx'})-[:rel]->(b) $$) as (a agtype);

rafsun42 avatar Nov 22 '23 17:11 rafsun42

it is working for me as well with limited number of vertices and edges. Issue appears when I loaded around 500k vertices and 900k edges and I have multi level nodes e.g employee --> org --> department --> division

Vertices and edges loaded using Load Graph functions

ajit-satav avatar Nov 23 '23 06:11 ajit-satav

@ajit-satav Is it possible to share your dataset?

rafsun42 avatar Nov 23 '23 16:11 rafsun42

Sharing dataset will be difficult. But I have observed that queries similar to below start breaking when we have larger number of edges e.g. over 2 million

SELECT * FROM cypher('demo', $$ MATCH (a:Employee{name:'xxxx'})-[*]-(b) RETURN a $$) as (a agtype);

ajit-satav avatar Nov 28 '23 13:11 ajit-satav

@ajit-satav Can you provide an output showing the two queries where one works and the other doesn't? Like what @rafsun42 showed above. Use the <> code option so that we can see that output here.

jrgemignani avatar Nov 28 '23 20:11 jrgemignani

@ajit-satav Also, please explain what you mean by "start breaking". Do you mean taking a long time? When you query for variable length edges with just [*], that means all lengths. When you use directionless edges -[*]- that means both directions. On a large graph, using both can be intensive depending on how connected the graph is.

Additionally, what version of AGE are you using?

jrgemignani avatar Nov 28 '23 20:11 jrgemignani

  1. AGE Version - 1.4.0

  2. Start breaking means it gives me zero results e..g

      SELECT * from 
       cypher('demo', $$
          MATCH (c1:components{name:'XXXX'})-[]-(c2:versions)
          return c2.name
      $$) as (v agtype);
    

    Results - No of c2.name returned - 2

      SELECT * from 
       cypher('demo', $$
          MATCH (c1:components{name:'XXXX'})-[*]-(c2:versions)
          return c2.name
      $$) as (v agtype);
    

    Results - No of c2.name returned - 0

ajit-satav avatar Nov 29 '23 06:11 ajit-satav

@ajit-satav Could you please include the results of the queries along with the queries? In particular, the full output of c1, c2, and the edge. Say, these queries instead -

SELECT * from 
   cypher('demo', $$
      MATCH (c1:components{name:'XXXX'})-[e]-(c2:versions)
      return c1,c2,e
  $$) as (c1 agtype, c2 agtype, e agtype);

SELECT * from 
   cypher('demo', $$
      MATCH (c1:components{name:'XXXX'})-[e*]-(c2:versions)
      return c1,c2,e
  $$) as (c1 agtype, c2 agtype, e agtype);

Just looking at the queries, and not the results (blank or otherwise), doesn't really give us much to go on.

jrgemignani avatar Nov 29 '23 15:11 jrgemignani

There was an issue with duplicate records inserted while creating edges. Now after correcting it and creating edges with create statement instead of csv import, query with MATCH (c1:components{name:'XXXX'})-[e*1]-(c2:versions) working fine. data-1701366514145.csv

But Query with MATCH (c1:components{name:'XXXX'})-[e*]-(c2:versions) taking forever to execute

I have index set on id column for all tables. Is it possible and recommended to set additional indexes on attributes within properties? if yes could you please suggest to how to achieve it?

ajit-satav avatar Dec 01 '23 15:12 ajit-satav

@ajit-satav For a VLE MATCH, indexes will have minimal impact on performance.

The reason your query is running long is due to specifying a directionless and unlimited length edge -[*]- If you want to improve performance, you will need to put restrictions on it. Even on small graphs, specifying a directionless edge along with unlimited length edges can take a long time. This is partly due to PostgreSQL and partly due to graph traversal in general.

Edges in AGE have a direction. By querying with a directionless edge, you can dramatically increase the number of paths through vertices and the path finding algorithms needs to find them all.

PostgreSQL, via the command line, will want to finish processing the entire query before outputting paged results. You can use the LIMIT command to force it to stop after a specified number of found values. However, I don't know if there is a way to force it to show results as it finds them.

jrgemignani avatar Dec 01 '23 20:12 jrgemignani

@ajit-satav Is this still a problem or is the issue resolved?

jrgemignani avatar Apr 17 '24 21:04 jrgemignani

This issue is stale because it has been open 60 days with no activity. Remove "Abondoned" label or comment or this will be closed in 14 days.

github-actions[bot] avatar Jun 17 '24 00:06 github-actions[bot]

This issue was closed because it has been stalled for further 14 days with no activity.

github-actions[bot] avatar Jul 01 '24 00:07 github-actions[bot]