apoc.periodic.iterate exceeds heap memory in Neo4j 4.0.0 with APOC 4.0.0.3

Issue by jialudeng Thursday Feb 20, 2020 at 06:25 GMT Originally opened as https://github.com/neo4j-contrib/neo4j-apoc-procedures/issues/1418

Expected Behavior (Mandatory)

The query should create 50 million new nodes and 50 million new relationships pointing to 10 million existing nodes.

Actual Behavior (Mandatory)

Only about 760k nodes and relationships were created before heap memory ran out. I later tested the same config and queries on Neo4j 3.4.14 with APOC 3.4.0.8. There, all 50 million nodes and relationships were created successfully in 1587348 ms (about 26 minutes).

How to Reproduce the Problem

I ran my query, wrapped in apoc.periodic.iterate, in Neo4j Desktop 1.2.4 with Neo4j 4.0.0 and APOC 4.0.0.3.

I increased the heap size in neo4j.conf, as the Neo4j community suggested:

dbms.memory.heap.initial_size=8G
dbms.memory.heap.max_size=8G
dbms.memory.pagecache.size=8G

I loaded the first group of Listing nodes from csv

CALL apoc.periodic.iterate("
CALL apoc.load.csv('file:///listings.csv',{
  mapping:{
    id: {type:'int'},
    beds: {type:'int'},
    price: {type: 'int'},
    score: {type: 'float'},
    reviews: {type: 'int'}
  }
}) YIELD map as row return row
","
CREATE (l:Listing) SET l = row
", {batchSize:10000, iterateList:true, parallel:true});

I created a unique constraint on the property id of Listing nodes

CREATE CONSTRAINT ON (listing:Listing) ASSERT listing.id IS UNIQUE

I then loaded the second group of Picture nodes from csv and created relationships, which maxed out the heap memory

CALL apoc.periodic.iterate("
CALL apoc.load.csv('file:///pictures.csv',{
  mapping:{
    id: {type:'int'},
    listing: {type:'int'}
  }
}) YIELD map as row RETURN row
"," 
  CREATE (p:Picture) SET p = row
  WITH p
  MATCH (l:Listing)
  WHERE p.listing = l.id
  CREATE (p)-[:PICTURE_OF]->(l)
", {batchSize:10000, parallel:false, iterateList:true});

Specifications (Mandatory)

Currently used versions

Versions

OS: macOS Catalina 10.15.3
Neo4j: 4.0.0
Neo4j-Apoc: 4.0.0.3

Sep 01 '22 11:09 neo-technology-build-agent

Comment by sarmbruster Thursday Feb 27, 2020 at 14:32 GMT

Can you provide the CSV files you used, so we can try to reproduce it? It would also be possible to share the files privately.

Sep 01 '22 11:09 neo-technology-build-agent

Comment by jialudeng Friday Feb 28, 2020 at 17:07 GMT

Yes absolutely! I just emailed you the links to my files on dropbox. Please let me know if there's any issue with accessing the files.

Sep 01 '22 11:09 neo-technology-build-agent

I am not sure whether this was ever resolved with the aforementioned files, but Cypher has come a long way since then. CALL { ... } IN TRANSACTIONS is a Cypher replacement for APOC's apoc.periodic.iterate which will handle memory tracking etc., so anyone coming here should try that first :)
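
For anyone who wants a starting point, here is a rough sketch of how the Picture import from the original post might look with CALL { ... } IN TRANSACTIONS. This is untested against the original files and rests on some assumptions: it needs Neo4j 4.4 or later (so it does not apply to the 4.0.0 setup in the report), it assumes pictures.csv has header columns named id and listing as the original mapping implies, and it swaps apoc.load.csv for plain LOAD CSV. The :auto prefix is only needed when running from Neo4j Browser or cypher-shell.

```cypher
// Sketch only: assumes Neo4j 4.4+ and a pictures.csv with "id" and "listing" headers.
:auto LOAD CSV WITH HEADERS FROM 'file:///pictures.csv' AS row
CALL {
  WITH row
  // LOAD CSV yields strings, so convert explicitly (the APOC mapping did this before).
  CREATE (p:Picture {id: toInteger(row.id), listing: toInteger(row.listing)})
  WITH p
  // Uses the unique constraint on Listing.id for an index lookup.
  MATCH (l:Listing {id: p.listing})
  CREATE (p)-[:PICTURE_OF]->(l)
} IN TRANSACTIONS OF 10000 ROWS
```

The batch size of 10000 rows mirrors the batchSize from the original apoc.periodic.iterate call and may need tuning for a given heap size.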

Mar 25 '25 13:03 gem-neo4j