running apoc.periodic.iterate and exceeding heap memory in neo4j 4.0.0 and APOC 4.0.0.3

Open neo-technology-build-agent opened this issue 3 years ago • 2 comments

Issue by jialudeng Thursday Feb 20, 2020 at 06:25 GMT Originally opened as https://github.com/neo4j-contrib/neo4j-apoc-procedures/issues/1418


Expected Behavior (Mandatory)

Creating 50 million new nodes and 50 million new relationships pointing to 10 million existing nodes.

Actual Behavior (Mandatory)

Only 760k nodes and relationships were created before the heap ran out of memory. I later tested the same config and queries using neo4j 3.4.14 and apoc 3.4.0.8. All 50 million nodes and relationships were successfully created in 1587348 ms (about 26 minutes).

How to Reproduce the Problem

I ran my apoc.periodic.iterate wrapped query in Neo4j Desktop version 1.2.4 with neo4j version 4.0.0 and APOC version 4.0.0.3.

I increased the heap size in neo4j.conf as the neo4j community suggested

dbms.memory.heap.initial_size=8G
dbms.memory.heap.max_size=8G
dbms.memory.pagecache.size=8G

I loaded the first group of Listing nodes from csv

CALL apoc.periodic.iterate("
CALL apoc.load.csv('file:///listings.csv',{
  mapping:{
    id: {type:'int'},
    beds: {type:'int'},
    price: {type: 'int'},
    score: {type: 'float'},
    reviews: {type: 'int'}
  }
}) YIELD map as row return row
","
CREATE (l:Listing) SET l = row
", {batchSize:10000, iterateList:true, parallel:true});

I created a unique constraint on the property id of Listing nodes

CREATE CONSTRAINT ON (listing:Listing) ASSERT listing.id IS UNIQUE

I then loaded the second group of Picture nodes from csv and created relationships, which maxed out the heap memory

CALL apoc.periodic.iterate("
CALL apoc.load.csv('file:///pictures.csv',{
  mapping:{
    id: {type:'int'},
    listing: {type:'int'}
  }
}) YIELD map as row RETURN row
"," 
  CREATE (p:Picture) SET p = row
  WITH p
  MATCH (l:Listing)
  WHERE p.listing = l.id
  CREATE (p)-[:PICTURE_OF]->(l)
", {batchSize:10000, parallel:false, iterateList:true});

Specifications (Mandatory)

Currently used versions

Versions

  • OS: macOS Catalina 10.15.3
  • Neo4j: 4.0.0
  • Neo4j-Apoc: 4.0.0.3

Comment by sarmbruster Thursday Feb 27, 2020 at 14:32 GMT


Can you provide the csv files you've used, so we can try to reproduce it? It would also be possible to share the files privately.

Comment by jialudeng Friday Feb 28, 2020 at 17:07 GMT


Yes absolutely! I just emailed you the links to my files on dropbox. Please let me know if there's any issue with accessing the files.

I am not sure if this was ever resolved with the aforementioned files, but Cypher has come a long way since then: CALL { ... } IN TRANSACTIONS is a native Cypher replacement for APOC's apoc.periodic.iterate and handles memory tracking etc., so anyone coming here should try that first :)

gem-neo4j, Mar 25 '25 13:03
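
For anyone wanting to try the suggestion above, here is a minimal sketch of how the Picture import from the original report might look with CALL { ... } IN TRANSACTIONS. It is not taken from the thread: it assumes Neo4j 4.4 or later, assumes pictures.csv has a header row with only the id and listing columns, and reuses the batch size of 10000 from the original apoc.periodic.iterate call. The query has to run in an implicit (auto-commit) transaction, so in Neo4j Browser it needs the :auto prefix.

```cypher
// Assumption: pictures.csv has a header row with the columns id and listing.
// Assumption: the unique constraint on :Listing(id) already exists, so the
// MATCH below can use its backing index.
:auto LOAD CSV WITH HEADERS FROM 'file:///pictures.csv' AS row
CALL {
  WITH row
  // LOAD CSV yields strings, so convert explicitly
  // (the apoc.load.csv mapping did this in the original query)
  CREATE (p:Picture {id: toInteger(row.id), listing: toInteger(row.listing)})
  WITH p
  MATCH (l:Listing {id: p.listing})
  CREATE (p)-[:PICTURE_OF]->(l)
} IN TRANSACTIONS OF 10000 ROWS;
```

Each batch of 10000 rows is committed in its own transaction, which is what keeps transaction state from accumulating on the heap. Whether this avoids the out-of-memory error seen on 4.0.0 has not been verified against the original files.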