combo of apoc.periodic.iterate/apoc.refactor.mergeNodes runs forever
Issue by sarmbruster
Friday Feb 14, 2020 at 17:07 GMT
Originally opened as https://github.com/neo4j-contrib/neo4j-apoc-procedures/issues/1408
To reproduce: run Neo4j 4.0.0 + APOC 4.0.0.2 and set up a test dataset:
unwind range(1,1000) as x
create (p:Person{name:"person_" + x}) -[:OWNS]->(o:Org{name: "org_" + x, id:x})
This statement will never terminate:
call apoc.periodic.iterate(
"match (o:Org) with collect(o) as orgs
unwind range(0,99) as batch
return orgs[batch*10..batch*10+10] as nodes
",
"CALL apoc.refactor.mergeNodes(nodes, { properties: {name: 'combine', `.*` : 'discard'}, mergeRels: true}) YIELD node RETURN node.name",
{batchSize:1})
Comment by sarmbruster
Sunday Feb 23, 2020 at 10:58 GMT
The reason for the stall is that apoc.refactor.mergeNodes internally grabs write locks upfront without first rebinding the nodes to the current transaction.
There's a workaround: prefix the second statement with CALL apoc.nodes.get(nodes) YIELD node WITH collect(node) AS nodes2 and pass nodes2 to apoc.refactor.mergeNodes instead of nodes.
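For reference, here is a sketch of the statement from the report with the workaround applied. It only combines the original query with the prefix described above. The parameters are unchanged, and nodes2 is just the intermediate name used in the workaround.

```cypher
// Same reproduction query as in the report, with the second statement
// prefixed so the nodes are re-fetched in the batch's own transaction.
CALL apoc.periodic.iterate(
  "MATCH (o:Org) WITH collect(o) AS orgs
   UNWIND range(0,99) AS batch
   RETURN orgs[batch*10..batch*10+10] AS nodes",
  "CALL apoc.nodes.get(nodes) YIELD node
   WITH collect(node) AS nodes2
   CALL apoc.refactor.mergeNodes(nodes2, {properties: {name: 'combine', `.*`: 'discard'}, mergeRels: true}) YIELD node
   RETURN node.name",
  {batchSize: 1})
```

Because collect(node) removes node from scope, the later YIELD node from mergeNodes does not clash with the one yielded by apoc.nodes.get.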
Closing, since an explanation and a workaround have been supplied.
We also now recommend using Cypher's CALL { ... } IN TRANSACTIONS instead of apoc.periodic.iterate :)
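As a rough sketch of what that could look like for the query in this issue, assuming a Neo4j version that supports CALL { ... } IN TRANSACTIONS (4.4 or later, so not the 4.0.0 setup from the report). It has not been verified against the dataset above. The statement has to run in an implicit (auto-commit) transaction, for example with the :auto prefix in Neo4j Browser.

```cypher
// Batches of 10 Org nodes, each merged in its own transaction.
MATCH (o:Org)
WITH collect(o) AS orgs
UNWIND range(0,99) AS batch
WITH orgs[batch*10..batch*10+10] AS nodes
CALL {
  WITH nodes
  CALL apoc.refactor.mergeNodes(nodes, {properties: {name: 'combine', `.*`: 'discard'}, mergeRels: true}) YIELD node
  RETURN node.name AS name
} IN TRANSACTIONS OF 1 ROWS
RETURN name
```

OF 1 ROWS mirrors batchSize:1 from the original call, so each list of ten nodes is merged and committed separately.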