RedisGraph icon indicating copy to clipboard operation
RedisGraph copied to clipboard

algo.BFS supports aggregation on properties

Open DeoLeung opened this issue 3 years ago • 3 comments

Hi,

I'm trying to use RedisGraph to solve a problem like this: I have node(station) with label poi and property name and relation(connected stations) with label link and property distance I want to search for stations connected to current station within distance of 6000 currently I can do it with

  MATCH p=(a:poi {name: "station_a"})-[r:link*]-(b:poi) 
  UNWIND [i in relationships(p)|i.distance] as distance 
  WITH p, sum(distance) as total 
  WHERE total < 6000 
  RETURN [i in nodes(p)|i.name], total

which seems to do a full scan

I found bfs is implemented, so I wonder whether we can support termination condition with aggregation, something like

  MATCH (a:poi {name: "station_a"})-[r:link*]-(b:poi) 
 CALL ALGO.bfs(a, sum(r.distance) < 6000, null) 
  YIELD NODES AS n, EDGES AS e

also the doc for using algo.bfs is missing...I found it in the issue..

DeoLeung avatar Jan 14 '22 06:01 DeoLeung

Hi, this is the execution plan I'm getting for this query

127.0.0.1:6379> GRAPH.EXPLAIN g 'MATCH p=(a:poi {name: "station_a"})-[r:link*]-(b:poi) UNWIND [i in relationships(p)|i.distance] as distance WITH p, sum(distance) as total WHERE total < 6000 RETURN [i in nodes(p)|i.name], total'
1) "Results"
2) "    Project"
3) "        Filter"
4) "            Aggregate"
5) "                Unwind"
6) "                    Conditional Traverse | (b:poi)->(b:poi)"
7) "                        Conditional Variable Length Traverse | (a:poi)-[r:link*1..INF]->(b:poi)"
8) "                            Filter"
9) "                                Node By Label Scan | (a:poi)"

As you can see a Node By Label Scan is being used to locate the a node, let me suggest introducing an index over the poi, name label, attribute pair CREATE INDEX ON :poi(name) with this index in place locating all of the relevant a nodes should be much faster.

At the moment I don't believe we'll be adding a filter to our BFS algorithm.

swilly22 avatar Jan 20 '22 13:01 swilly22

Hi, this is the execution plan I'm getting for this query

127.0.0.1:6379> GRAPH.EXPLAIN g 'MATCH p=(a:poi {name: "station_a"})-[r:link*]-(b:poi) UNWIND [i in relationships(p)|i.distance] as distance WITH p, sum(distance) as total WHERE total < 6000 RETURN [i in nodes(p)|i.name], total'
1) "Results"
2) "    Project"
3) "        Filter"
4) "            Aggregate"
5) "                Unwind"
6) "                    Conditional Traverse | (b:poi)->(b:poi)"
7) "                        Conditional Variable Length Traverse | (a:poi)-[r:link*1..INF]->(b:poi)"
8) "                            Filter"
9) "                                Node By Label Scan | (a:poi)"

As you can see a Node By Label Scan is being used to locate the a node, let me suggest introducing an index over the poi, name label, attribute pair CREATE INDEX ON :poi(name) with this index in place locating all of the relevant a nodes should be much faster.

At the moment I don't believe we'll be adding a filter to our BFS algorithm.

yes I did the indexing to speed up :)

adding a filter could save the bfs from scanning the whole graph(or many hops), I think.

DeoLeung avatar Jan 24 '22 13:01 DeoLeung

you're right, but at the moment we don't have the ability to introduce filters to BFS, we do have that for our variable length traversal (DFS) in case that's a valid option for you

swilly22 avatar Jan 27 '22 07:01 swilly22