apoc.merge.node returns java.lang.NullPointerException on "name"

apoc.merge.node returns java.lang.NullPointerException on "name" when there is no "name" property in the data.

Here is the bug reproduced in Neo4j Browser, using this test file (remove the .txt extension): followers.test.json.txt

call apoc.load.json("followers.test.json") yield value
unwind value["status"] as twJson
merge (u:TwUser {id:"826009201600233474"}) with twJson,u
    call apoc.merge.node([twJson.label,twJson.labelOrigine],twJson) yield node
    with twJson,u,node  where not exists((u)-[:TWEETED]->(node))
    create (u)-[:TWEETED {twExportDate:twJson.twExportDate, importDateTime:localdatetime({ timezone: 'Europe/Paris' })}]->(node)

debug.log contains the following entries, but none of them were logged at the time the query above was run:

2023-08-13 08:08:20.500+0000 WARN  [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=6244, gcTime=0, gcCount=0}
2023-08-13 08:08:20.704+0000 WARN  [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=135732722, gcTime=0, gcCount=0}
2023-08-13 08:08:22.168+0000 WARN  [o.n.k.i.c.VmPauseMonitorComponent] Detected VM stop-the-world pause: {pauseTime=1264, gcTime=0, gcCount=0}
2023-08-13 08:08:22.179+0000 WARN  [a.c.ClusterHeartbeat] Cluster Node [akka://cc-discovery-actor-system@localhost:5000] - Scheduled sending of heartbeat was delayed. Previous heartbeat was sent [135739724] ms ago, expected interval is [1000] ms. This may cause failure detection to mark members as unreachable. The reason can be thread starvation, CPU overload, or GC.
2023-08-13 09:11:53.549+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] [neo4j/229ff1a9] Checkpoint triggered by "Scheduled checkpoint for every 15 minutes threshold" @ txId: 36 checkpoint started...
2023-08-13 09:11:53.627+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] [neo4j/229ff1a9] Checkpoint triggered by "Scheduled checkpoint for every 15 minutes threshold" @ txId: 36 checkpoint completed in 76ms. Checkpoint flushed 49 pages (0% of total available pages), in 44 IOs. Checkpoint performed with IO limit: 600, paused in total 0 times( 0 millis).
2023-08-13 09:11:53.627+0000 INFO  [o.n.k.i.t.l.p.LogPruningImpl] [neo4j/229ff1a9] No log version pruned. The strategy used was '2 days'. 
2023-08-13 09:11:57.357+0000 INFO  [o.n.c.i.c.CypherQueryCaches] [neo4j/229ff1a9] Discarded stale query from the query cache after 266 seconds. Reason: IndexPropertyExistsSelectivity(IndexDescriptor(Text,Node(LabelId(3)),List(PropertyKeyId(10)),Set(),NONE,DoNotGetValue,None,false)) changed from 0.0 to 0.1, which is a divergence of 1.0 which is greater than threshold 0.7033643993854399. Query id: 4514.

Versions

OS: Windows 10
Neo4j: desktop 1.5.8 with 5.9.0 db
Neo4j-Apoc: 5.9

Aug 13 '23 09:08 v2belleville

Thanks for reporting this. We will come back to you once we have had a look.

Aug 15 '23 08:08 nadja-muller

Hi.

Thanks again for reporting this. The problem in your query is that some of the labels passed into apoc.merge.node are null. You could add a check similar to this snippet:

WITH {label: 'a', labelOrigin: 'b'} AS twJson
WHERE twJson.label IS NOT NULL AND twJson.labelOrigin IS NOT NULL
CALL apoc.merge.node([twJson.label, twJson.labelOrigin], twJson) YIELD node
RETURN node
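
Applied to the query from the original post, the same check might look like the sketch below. It assumes the JSON key is spelled labelOrigine, as in the original query (the snippet above uses labelOrigin), and it simply skips any row where either label is null. It has not been run against your file, so treat it as a starting point.

```cypher
CALL apoc.load.json("followers.test.json") YIELD value
UNWIND value["status"] AS twJson
// Skip rows whose labels are missing, so no null reaches apoc.merge.node
WITH twJson
WHERE twJson.label IS NOT NULL AND twJson.labelOrigine IS NOT NULL
MERGE (u:TwUser {id: "826009201600233474"})
WITH twJson, u
CALL apoc.merge.node([twJson.label, twJson.labelOrigine], twJson) YIELD node
WITH twJson, u, node
WHERE NOT exists((u)-[:TWEETED]->(node))
CREATE (u)-[:TWEETED {twExportDate: twJson.twExportDate, importDateTime: localdatetime({timezone: 'Europe/Paris'})}]->(node)
```

Because the filter comes first, the TwUser node is only merged when at least one row passes the check.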

While this behavior won't change, we will fix the code to return a proper error message. Let us know if this does not solve your issue.

Oct 03 '23 16:10 nadja-muller

Thanks, Nadja. I had oversimplified the test JSON, but in the real code the whole thing did indeed break because twJson.label and twJson.labelOrigin were null. A message saying that the labels have null values will definitely be more helpful!

Oct 13 '23 17:10 v2belleville

This was fixed :) https://github.com/neo4j/apoc/pull/503

Jan 02 '25 08:01 gem-neo4j