Skip relationships with missing nodes instead of failing

Open kylemarkwilliams opened this issue 12 years ago • 5 comments

When either the "start" or "end" node of a relationship does not exist, the import fails with:

[WARNING]
java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:297)
        at java.lang.Thread.run(Thread.java:722)
Caused by: org.neo4j.kernel.impl.nioneo.store.InvalidRecordException: NodeRecord[12972393] not in use
        at org.neo4j.kernel.impl.nioneo.store.NodeStore.getRecord(NodeStore.java:252)
        at org.neo4j.kernel.impl.nioneo.store.NodeStore.getRecord(NodeStore.java:125)
        at org.neo4j.unsafe.batchinsert.BatchInserterImpl.getNodeRecord(BatchInserterImpl.java:1190)
        at org.neo4j.unsafe.batchinsert.BatchInserterImpl.createRelationship(BatchInserterImpl.java:750)
        at org.neo4j.batchimport.Importer.importRelationships(Importer.java:158)
        at org.neo4j.batchimport.Importer.doImport(Importer.java:236)
        at org.neo4j.batchimport.Importer.main(Importer.java:83)
        ... 6 more
[INFO] ------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO] ------------------------------------------------------------------------
[INFO] An exception occured while executing the Java class. null

NodeRecord[12972393] not in use

Where 12972393 was the missing node ID.

Is it possible to print a warning message and skip the relationship instead of having the whole import fail? Even if this were not the default behavior, I think it would be useful as a configuration option.

kylemarkwilliams avatar Nov 27 '13 05:11 kylemarkwilliams
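
The requested behavior (warn and skip rather than abort) can be sketched roughly as below. This is not batch-import's actual code: `SkipMissingNodes`, `Rel`, and `filterImportable` are hypothetical names, and the `LongPredicate` is a stand-in for whatever existence check the importer has available (Neo4j's `BatchInserter` has a `nodeExists(long)` method that could plausibly back it, but that wiring is an assumption).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.LongPredicate;

public class SkipMissingNodes {

    /** One relationship row from the input file (hypothetical shape). */
    public static final class Rel {
        final long start;
        final long end;
        final String type;

        public Rel(long start, long end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    /**
     * Returns only the rows whose start and end nodes both exist.
     * Rows with a missing endpoint are reported on stderr and skipped.
     */
    public static List<Rel> filterImportable(List<Rel> rows, LongPredicate nodeExists) {
        List<Rel> importable = new ArrayList<>();
        for (Rel r : rows) {
            if (!nodeExists.test(r.start)) {
                System.err.println("WARN: skipping " + r.type + ", missing start node " + r.start);
                continue;
            }
            if (!nodeExists.test(r.end)) {
                System.err.println("WARN: skipping " + r.type + ", missing end node " + r.end);
                continue;
            }
            importable.add(r);
        }
        return importable;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the node store: only nodes 1 and 2 exist.
        Set<Long> nodes = new HashSet<>(Arrays.asList(1L, 2L));
        List<Rel> rows = Arrays.asList(
                new Rel(1, 2, "KNOWS"),
                new Rel(1, 12972393L, "KNOWS")); // end node is missing
        List<Rel> ok = filterImportable(rows, id -> nodes.contains(id));
        System.out.println(ok.size() + " of " + rows.size() + " relationships importable");
    }
}
```

Checking both endpoints before calling into the store avoids relying on `InvalidRecordException` for control flow, at the cost of one extra lookup per endpoint.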

Actually it already does this for index lookups, I can add it for direct node id lookups too.

jexp avatar Jan 16 '14 09:01 jexp

Yes, I think it would be a nice feature. Perhaps with a warning printed stating that the node is missing.

Thanks!

kylemarkwilliams avatar Jan 16 '14 21:01 kylemarkwilliams

The new neo4j-import tool supports skipping and logging unmet relationships.

See http://neo4j.com/docs/stable/import-tool.html

On 17.06.2015 at 22:32, Raymond Plante [email protected] wrote:

Did this happen? Would be great feature when dealing with millions of nodes/relationships

jexp avatar Jun 17 '15 22:06 jexp

@jexp Thanks. If you set --skip-bad-relationships, it says they're logged up to the max indicated by --bad-tolerance. Do you know if this means the import will still continue, just no longer logging the bad ones it comes across?

raymondjplante avatar Jun 18 '15 17:06 raymondjplante

ohh, this doesn't help with batch inserting... perhaps you could queue rels with missing nodes as, let's say, RelationshipPrecalculations, and check whether the nodes are still missing at the end of the import? that would make more sense than just throwing an error immediately I think

ehx-v1 avatar Apr 13 '16 12:04 ehx-v1
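
The queueing idea in the last comment could look something like the sketch below: relationships whose endpoints are not yet known are parked and retried once all input has been read. Everything here is illustrative. `DeferredRelationships` and its methods are hypothetical names (the comment's `RelationshipPrecalculations` is not an existing class as far as this thread shows), and a `HashSet` stands in for the real node store.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DeferredRelationships {
    private final Set<Long> nodes = new HashSet<>();         // stand-in for the node store
    private final List<long[]> created = new ArrayList<>();  // {start, end} pairs written so far
    private final Deque<long[]> pending = new ArrayDeque<>(); // relationships waiting for a node

    public void addNode(long id) {
        nodes.add(id);
    }

    /** Creates the relationship now if both nodes exist, otherwise queues it. */
    public void addRelationship(long start, long end) {
        long[] rel = {start, end};
        if (nodes.contains(start) && nodes.contains(end)) {
            created.add(rel);
        } else {
            pending.add(rel);
        }
    }

    /**
     * Retries every queued relationship at the end of the import.
     * Returns the ones whose nodes are still missing, after warning about each.
     */
    public List<long[]> finish() {
        List<long[]> unresolved = new ArrayList<>();
        while (!pending.isEmpty()) {
            long[] rel = pending.poll();
            if (nodes.contains(rel[0]) && nodes.contains(rel[1])) {
                created.add(rel);
            } else {
                System.err.println("WARN: skipping relationship " + rel[0] + " -> " + rel[1]
                        + ", node still missing at end of import");
                unresolved.add(rel);
            }
        }
        return unresolved;
    }

    public int createdCount() {
        return created.size();
    }
}
```

A caller would add nodes and relationships in whatever order the input provides and call `finish()` once at the end. The queue lives in memory, so for imports with millions of dangling relationships a real implementation would probably need to spill it to disk.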