aclpwn.py icon indicating copy to clipboard operation
aclpwn.py copied to clipboard

Feature Request: Improve Dijkstra performance

Open machosec opened this issue 6 years ago • 3 comments

Hi @dirkjanm, The current implementation of the Dijkstra algorithm is not optimal, which is noticeable when dealing with larger graphs. As you mentioned in the wiki, the ShortestPath algorithm is faster than Dijkstra but the performance differences can be greatly reduced with few optimizations.

Neo4j stores relationship's properties values on disk which mean a lot of unnecessary IO overhead which dramatically slowing the process of graph traversal. Because the weight is constant for each ACL, it's better to calculate the cost based on the relationship type (name) which is in-memory (assuming the entire graph can be loaded to RAM and cache is warmed up). You can achieve this by tweaking the neo4j-graph-algorithms plugin to use constant weight values for each ACL instead of looking at the relationship's property. This will also spare the creation of additional properties.

machosec avatar Dec 05 '18 10:12 machosec

Thanks for thinking along with this :) Do you have an example of how and where this can be configured? Note that the weight is not constant for each ACL, it depends on the start and end node as well, see the code here: https://github.com/fox-it/aclpwn.py/blob/master/aclpwn/database.py#L78 I haven't come along something like this in the exposed cypher procedures. Also the default uses the REST API Dijkstra algorithm, which doesn't use the neo4j-graph-algorithms plugin.

dirkjanm avatar Dec 05 '18 11:12 dirkjanm

Because Neo4j builtin Dijsktra capabilities are disappointing, you would probably have to write a user procedure that implements Dijkstra with relationship name as cost. This issue made others create custom plugins who modified the implementation to their liking. This link is a good place to start: https://grokbase.com/t/gg/neo4j/156knxv03a/performance-difference-between-dijkstra-and-allshortestpath With user procedures, you could also evaluate the weight based on the start/end node and make it REST compatible.

machosec avatar Dec 05 '18 16:12 machosec

Hi, I wrote a code sample that uses a Map object for Dijkstra's CostEvaluator instead of relationship's distance property. Please note that it uses some of apoc's helpers function. If you're not familiar with Neo4j's Procedures, you may refer to link.

import java.util.Map;
import java.util.stream.Stream;

import org.neo4j.graphdb.*;
import org.neo4j.logging.Log;
import org.neo4j.procedure.*;
import org.neo4j.graphalgo.WeightedPath;
import org.neo4j.graphalgo.GraphAlgoFactory;
import org.neo4j.graphalgo.PathFinder;
import org.neo4j.helpers.collection.Pair;

import static apoc.util.Util.toDouble;
import apoc.path.RelationshipTypeAndDirections;
import apoc.result.WeightedPathResult;

public class Procedure {

    @Context
    public Log log;

    @Procedure(value = "util.findWeightedPath")
    @Description("CALL util.findPathWithConfig(startNode, endNode, 'KNOWS|<WORKS_WITH|IS_MANAGER_OF>', {'KNOWS': 5}, defaultValue, numberOfWantedResults) YIELD path," +
            " weight")
    public Stream<WeightedPathResult> findWeightedPath(
            @Name("startNode") Node startNode,
            @Name("endNode") Node endNode,
            @Name("relationshipTypesAndDirections") String relTypesAndDirs,
            @Name("relationshipsWeights") Map<String, Double> relsWeights,
            @Name(value = "defaultWeight", defaultValue = "NaN") double defaultWeight,
            @Name(value = "numberOfWantedPaths", defaultValue = "1") long numberOfWantedPaths) {
        PathFinder<WeightedPath> algo = GraphAlgoFactory.dijkstra(
                buildPathExpander(relTypesAndDirs),
                (relationship, direction) -> toDouble(relsWeights.getOrDefault(relationship, defaultWeight)),
                (int)numberOfWantedPaths
        );
        return WeightedPathResult.streamWeightedPathResult(startNode, endNode, algo);
    }
	
	public static Map<String, Double> relsWeights = new HashMap<String, Double>() {
		{
			put("KNOWS", 1.0);
			put("WORKS_WITH", 10.0);
			put("IS_MANAGER_OF", 100.0);
		}
    };

	@Procedure(value = "util.findPath")
	@Description("CALL util.findPath(startNode, endNode, numberOfWantedResults) YIELD path, weight")
	public Stream<WeightedPathResult> searchPath(@Name("startNode") Node startNode,
													   @Name("endNode") Node endNode,
													   @Name(value = "numberOfWantedPaths", defaultValue = "1") long numberOfWantedPaths) {
		return this.findPathWithConfig(startNode, endNode, '>', relsWeights, 1, numberOfWantedPaths);
	}


    private static PathExpander<Object> buildPathExpander(String relationshipsAndDirections) {
        PathExpanderBuilder builder = PathExpanderBuilder.empty();
        for (Pair<RelationshipType, Direction> pair : RelationshipTypeAndDirections
                .parse(relationshipsAndDirections)) {
            if (pair.first() == null) {
                if (pair.other() == null) {
                    builder = PathExpanderBuilder.allTypesAndDirections();
                } else {
                    builder = PathExpanderBuilder.allTypes(pair.other());
                }
            } else {
                if (pair.other() == null) {
                    builder = builder.add(pair.first());
                } else {
                    builder = builder.add(pair.first(), pair.other());
                }
            }
        }
        return builder.build();
    }
}

MrAnde7son avatar Dec 10 '18 10:12 MrAnde7son