Import not working with neo4j 5.1

Open gravok opened this issue 2 years ago • 13 comments

Describe the bug Pulled the newest neo4j docker image from Docker Hub. Because the database format changed, I deleted the old database and tried to import my json files. The upload is very slow or stalls completely.

To Reproduce Steps to reproduce the behavior:

  1. Pull neo4j 5.1 from docker hub
  2. Import json files
  3. Look at upload progression compared with neo4j 4.4.12

Expected behavior The import should be the same as for neo4j 4.4.12

gravok avatar Nov 03 '22 11:11 gravok

Can confirm, using neo4j 5.1 tanks performance, compared to 4.4.X.

xathon avatar Nov 04 '22 12:11 xathon

I have been fighting issues with import all week; I've tried importing only specific json files. Even the smallest files take minutes to import with 5.1. The BloodHound Read the Docs should probably specify the supported version.

lucasni2 avatar Nov 04 '22 21:11 lucasni2

Can confirm here as well. neo4j 4.4.12 and JDK 11 work great.

ag-michael avatar Nov 07 '22 01:11 ag-michael

Hi all,

Thanks for opening this issue and for the comments here. It seems there have been performance regression issues with Neo4j's latest major release, Neo4j 5:

https://github.com/neo4j/neo4j/issues/12977

While we work on a fix, please stick to Neo4j version 4.4.13, which you can get here: https://neo4j.com/download-center/#community

I updated the installation instructions in ReadTheDocs to repeat this message:

https://bloodhound.readthedocs.io/en/latest/installation/windows.html
https://bloodhound.readthedocs.io/en/latest/installation/osx.html
https://bloodhound.readthedocs.io/en/latest/installation/linux.html

Andy

andyrobbins avatar Nov 09 '22 20:11 andyrobbins

@andyrobbins would you be able to identify the statement(s) in bloodhound import that are slow?

Not sure if you already have timings in your imports, otherwise you could also look at the query logs of an EE instance.
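
One way to get such timings without an EE query log is to wrap each query execution in a small timer. The following is only a sketch: `timeQuery`, `runQuery`, and `demo` are hypothetical names, and `runQuery` stands in for whatever actually executes the Cypher statement (for example a driver session call); none of this is taken from BloodHound's code.

```javascript
// Hypothetical helper: measures wall-clock time of any async query runner.
// `runQuery` is an assumption; it stands in for the call that executes Cypher.
async function timeQuery(label, runQuery, timings) {
    const start = process.hrtime.bigint();
    try {
        return await runQuery();
    } finally {
        // Record the duration even if the query throws.
        const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
        timings.push({ label, elapsedMs });
    }
}

// Demo with a stand-in runner that just resolves after a short delay.
async function demo() {
    const timings = [];
    const fakeRun = () => new Promise((resolve) => setTimeout(() => resolve('ok'), 20));
    await timeQuery('node-merge', fakeRun, timings);
    await timeQuery('edge-merge', fakeRun, timings);
    // Slowest statement first, so the problem query is at the top.
    timings.sort((a, b) => b.elapsedMs - a.elapsedMs);
    return timings;
}
```

Running the same import against 4.4.x and 5.1 and comparing the sorted lists should show which statement regressed.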

jexp avatar Nov 10 '22 12:11 jexp

@andyrobbins ping :)

jexp avatar Nov 11 '22 16:11 jexp

Based on ingest being slow, the problem queries would be the following:

UNWIND $props AS prop MERGE (n:Base {objectid: prop.source}) SET n:{0} MERGE (m:Base {objectid: prop.target}) SET m:{1} MERGE (n)-[r:{2} {3}]->(m)

UNWIND $props AS prop MERGE (n:AZBase {objectid: prop.source}) SET n:{0} MERGE (m:AZBase {objectid: prop.target}) SET m:{1} MERGE (n)-[r:{2} {3}]->(m)

UNWIND $props AS prop MERGE (n:Base {objectid:prop.objectid}) SET n:{} SET n += prop.map

I couldn't tell you which of these queries is actually the issue, since I don't have time to dig into it right now. Note that the vars in curly braces are string-substituted at runtime, but it should be fairly obvious what each one corresponds to.
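
To make the substitution concrete, here is a minimal sketch of how the first template could expand. `formatQuery` is a hypothetical positional formatter written for this illustration, not BloodHound's own helper, and the values `User`, `Group`, `MemberOf`, and `{isacl: false}` are assumed examples of what the placeholders might receive.

```javascript
// Minimal positional formatter: replaces {0}, {1}, ... with the matching argument.
// Hypothetical helper for illustration; BloodHound's real helper may differ.
function formatQuery(template, ...args) {
    return template.replace(/\{(\d+)\}/g, (match, index) => {
        const i = Number(index);
        // Leave the placeholder untouched if no argument was supplied for it.
        return i < args.length ? String(args[i]) : match;
    });
}

const edgeTemplate =
    'UNWIND $props AS prop MERGE (n:Base {objectid: prop.source}) SET n:{0} ' +
    'MERGE (m:Base {objectid: prop.target}) SET m:{1} MERGE (n)-[r:{2} {3}]->(m)';

// Assumed example values: source label, target label, edge type, edge properties.
const edgeQuery = formatQuery(edgeTemplate, 'User', 'Group', 'MemberOf', '{isacl: false}');
```

Only the numbered placeholders are replaced; the `{objectid: ...}` property maps and the `$props` parameter are left as they are.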

rvazarkar avatar Nov 11 '22 17:11 rvazarkar

Do you know if bloodhound creates the correct constraints for :Label(objectid) for each of those?

That sounds like the main culprit.

jexp avatar Nov 11 '22 17:11 jexp

https://github.com/BloodHoundAD/BloodHound/blob/master/src/js/utils.js#L212

Should be yes

rvazarkar avatar Nov 11 '22 17:11 rvazarkar

@rvazarkar Those index creations use deprecated procedures. There are no db.createIndex or db.createUniquePropertyConstraint procedures anymore.

ikwattro avatar Nov 11 '22 19:11 ikwattro

Depending on which versions you want to be compatible with, you might need to pick the 4.x or 5.x variants.

Most important docs for constraints

general syntax

create constraint base_objectid if not exists for (b:Base) require (b.objectid) is unique

https://neo4j.com/docs/cypher-manual/current/constraints/

Docs for indexes https://neo4j.com/docs/cypher-manual/current/indexes-for-search-performance/
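
Picking between the variants could look roughly like the sketch below. It assumes that the `FOR ... REQUIRE` form shown above is accepted from Neo4j 4.4 onward and that older 4.x servers need the `ON ... ASSERT` form; both assumptions, and the availability of `IF NOT EXISTS` on the older server, should be checked against the constraint docs for the exact versions in use. `uniqueConstraintStatement` is a hypothetical helper name.

```javascript
// Hypothetical helper: builds a uniqueness-constraint statement for a given server version.
// Assumption: FOR ... REQUIRE works on 4.4+ and 5.x; older 4.x needs ON ... ASSERT.
function uniqueConstraintStatement(label, property, serverVersion) {
    const [major, minor] = serverVersion.split('.').map(Number);
    const name = `${label.toLowerCase()}_${property.toLowerCase()}`;
    const useForRequire = major > 4 || (major === 4 && minor >= 4);
    return useForRequire
        ? `CREATE CONSTRAINT ${name} IF NOT EXISTS FOR (n:${label}) REQUIRE n.${property} IS UNIQUE`
        : `CREATE CONSTRAINT ${name} IF NOT EXISTS ON (n:${label}) ASSERT n.${property} IS UNIQUE`;
}

// One statement per label used in the MERGE queries from this thread.
const statements = ['Base', 'AZBase'].map((label) =>
    uniqueConstraintStatement(label, 'objectid', '5.1.0')
);
```

The version string would normally come from the server itself rather than being hard-coded as it is here.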

jexp avatar Nov 12 '22 08:11 jexp

@rvazarkar @andyrobbins

Did you try to use the new constraint syntax to fix the issue for neo4j 5.x?

You also don't want to catch exceptions and ignore them, because that makes the errors invisible. https://github.com/BloodHoundAD/BloodHound/blob/master/src/js/utils.js#L255-L270
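
A minimal sketch of surfacing such failures instead of swallowing them is shown below. `applySchema` and `run` are hypothetical names; `run` stands in for whatever executes a single schema statement, and this is not the code at the linked lines.

```javascript
// Hypothetical helper: runs schema statements one by one and reports every failure.
// `run` is an assumption; it stands in for the call that executes one statement.
async function applySchema(statements, run) {
    const failures = [];
    for (const statement of statements) {
        try {
            await run(statement);
        } catch (err) {
            // Log and collect the error rather than ignoring it.
            console.error(`Schema statement failed: ${statement}\n  ${err.message}`);
            failures.push({ statement, message: err.message });
        }
    }
    return failures;
}
```

A caller can then warn the user or abort the import when the returned list is not empty, so that a missing constraint is noticed before ingest starts.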

jexp avatar Mar 27 '23 16:03 jexp

> Can confirm, using neo4j 5.1 tanks performance, compared to 4.4.X.

Is the problem still present with the latest version, neo4j 5.10?

Dramelac avatar Jul 21 '23 14:07 Dramelac