batch-import
Skip relationships with missing nodes instead of failing
When either the "start" or "end" node of a relationship does not exist, the import fails with:
[WARNING]
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:297)
at java.lang.Thread.run(Thread.java:722)
Caused by: org.neo4j.kernel.impl.nioneo.store.InvalidRecordException: NodeRecord[12972393] not in use
at org.neo4j.kernel.impl.nioneo.store.NodeStore.getRecord(NodeStore.java:252)
at org.neo4j.kernel.impl.nioneo.store.NodeStore.getRecord(NodeStore.java:125)
at org.neo4j.unsafe.batchinsert.BatchInserterImpl.getNodeRecord(BatchInserterImpl.java:1190)
at org.neo4j.unsafe.batchinsert.BatchInserterImpl.createRelationship(BatchInserterImpl.java:750)
at org.neo4j.batchimport.Importer.importRelationships(Importer.java:158)
at org.neo4j.batchimport.Importer.doImport(Importer.java:236)
at org.neo4j.batchimport.Importer.main(Importer.java:83)
... 6 more
[INFO] ------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO] ------------------------------------------------------------------------
[INFO] An exception occured while executing the Java class. null
NodeRecord[12972393] not in use
Where 12972393 was the missing node ID.
Is it possible for a warning message to be printed and the relationship skipped instead of having the whole import fail? Even if this was not the default behavior, I think it would be a useful feature as a configuration option.
Actually, it already does this for index lookups; I can add it for direct node ID lookups too.
Yes, I think it would be a nice feature, perhaps with a warning printed stating which node is missing.
Thanks!
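For illustration, the skip-and-warn behaviour requested above could look roughly like the following. This is a minimal sketch and not the actual batch-import code: the class `SkippingRelationshipImporter`, the `Rel` holder, and the in-memory `Set<Long>` standing in for the store's "node in use" check are all hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these names come from batch-import itself.
class SkippingRelationshipImporter {

    // One relationship row: start node id, end node id, relationship type.
    static final class Rel {
        final long start;
        final long end;
        final String type;

        Rel(long start, long end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    // Assumption: a plain set stands in for the store's node-existence check.
    private final Set<Long> existingNodes;
    private final List<Rel> created = new ArrayList<>();
    private final List<String> warnings = new ArrayList<>();

    SkippingRelationshipImporter(Set<Long> existingNodes) {
        this.existingNodes = existingNodes;
    }

    // Returns true if the relationship was created, false if it was skipped.
    boolean importRelationship(Rel rel) {
        for (long id : new long[] {rel.start, rel.end}) {
            if (!existingNodes.contains(id)) {
                String msg = "WARN: skipping " + rel.type + " (" + rel.start + ")->("
                        + rel.end + "): node " + id + " does not exist";
                warnings.add(msg);
                System.err.println(msg);
                return false; // skip this row instead of aborting the import
            }
        }
        created.add(rel);
        return true;
    }

    List<Rel> created() {
        return created;
    }

    List<String> warnings() {
        return warnings;
    }

    public static void main(String[] args) {
        Set<Long> nodes = new HashSet<>(Arrays.asList(1L, 2L));
        SkippingRelationshipImporter importer = new SkippingRelationshipImporter(nodes);
        importer.importRelationship(new Rel(1L, 2L, "KNOWS"));
        importer.importRelationship(new Rel(1L, 12972393L, "KNOWS")); // missing end node
        System.out.println("created=" + importer.created().size()
                + " skipped=" + importer.warnings().size());
    }
}
```

The point of the sketch is only that the existence check happens before the relationship is created, so a missing node produces a warning and a skipped row rather than an exception that ends the whole run.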
On Thu, Jan 16, 2014 at 4:45 AM, Michael Hunger wrote:
Actually it already does this for index lookups, I can add it for direct node id lookups too.
The new neo4j-import tool supports skipping and logging unmet relationships.
See http://neo4j.com/docs/stable/import-tool.html
On 17.06.2015 at 22:32, Raymond Plante wrote:
Did this happen? It would be a great feature when dealing with millions of nodes/relationships.
@jexp Thanks. If you set --skip-bad-relationships, it says they're logged up to the max indicated by --bad-tolerance. Do you know if this means the import will still continue, just no longer logging the bad ones it comes across?
Ohh, this doesn't help with batch inserting... Perhaps you could queue rels with missing nodes as, let's say, RelationshipPrecalculations, and check whether the nodes are still missing at the end of the import? I think that would make more sense than throwing an error immediately.
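To make the queueing idea concrete, here is a rough sketch of how deferred relationships might work. Everything in it is an assumption for illustration: `DeferredRelationshipImporter`, `Rel`, and the in-memory node set are hypothetical and do not exist in batch-import or Neo4j, and the `RelationshipPrecalculations` name from the comment above is represented simply as a pending list.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the "queue and re-check at the end" idea.
class DeferredRelationshipImporter {

    // One relationship row: start node id, end node id, relationship type.
    static final class Rel {
        final long start;
        final long end;
        final String type;

        Rel(long start, long end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    private final Set<Long> nodes = new HashSet<>();    // node ids seen so far
    private final List<Rel> created = new ArrayList<>(); // relationships written
    private final List<Rel> pending = new ArrayList<>(); // waiting for a missing node

    void addNode(long id) {
        nodes.add(id);
    }

    // Create the relationship now if both ends exist, otherwise queue it.
    void addRelationship(Rel rel) {
        if (bothEndsExist(rel)) {
            created.add(rel);
        } else {
            pending.add(rel);
        }
    }

    // Call once at the end of the import: retries every queued relationship
    // and returns the ones whose nodes are still missing.
    List<Rel> finish() {
        List<Rel> unresolved = new ArrayList<>();
        for (Rel rel : pending) {
            if (bothEndsExist(rel)) {
                created.add(rel);
            } else {
                unresolved.add(rel);
            }
        }
        pending.clear();
        return unresolved;
    }

    List<Rel> created() {
        return created;
    }

    private boolean bothEndsExist(Rel rel) {
        return nodes.contains(rel.start) && nodes.contains(rel.end);
    }

    public static void main(String[] args) {
        DeferredRelationshipImporter importer = new DeferredRelationshipImporter();
        importer.addNode(1L);
        importer.addRelationship(new Rel(1L, 2L, "KNOWS")); // node 2 not there yet
        importer.addRelationship(new Rel(1L, 3L, "KNOWS")); // node 3 never arrives
        importer.addNode(2L);
        List<Rel> unresolved = importer.finish();
        System.out.println("created=" + importer.created().size()
                + " unresolved=" + unresolved.size());
    }
}
```

One trade-off to keep in mind: the pending list lives in memory, so if a large share of relationships reference nodes that arrive late (or never), the queue could grow large on imports with millions of rows.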