arcadedb
Importer: url and id type problems
ArcadeDB Version:
ArcadeDB Server v24.2.1 (build 5c4448730af1f607dec0b7dfb2e8dffc6b33b3cb/1713169925164/main)
OS and JDK Version:
Running on Mac OS X 12.7.4 - OpenJDK 64-Bit Server VM 17.0.10 (Homebrew)
I found two problems with the importer, particularly via SQL, i.e. `IMPORT DATABASE`:
- [ ] If importing a graph, the `url` field seems to be required even though a `vertices` and/or an `edges` file is supplied as a setting, which means whatever is placed in the `url` field is parsed and inserted as `Documents`, even though this is unnecessary.
- [ ] Edges are only linked if the id type of the vertices file matches the type of the `from` and `to` fields of the edge file. By default the vertex `typeIdType` is `String`, while the ids in edge files seem to be `Long` integers by default and are not configurable.
Expected behavior
This:
IMPORT DATABASE WITH vertices="file://vertices.csv", verticesFileType=csv, typeIdProperty=Id, edges="file://edges.csv", edgesFileType=csv, edgeFromField="From", edgeToField="To"
should work.
Actual behavior
Internal error Cannot invoke "com.arcadedb.query.sql.parser.Url.toString(java.util.Map, StringBuilder)" because "this.url" is null
Error on command execution (PostCommandHandler)
java.lang.NullPointerException: Cannot invoke "com.arcadedb.query.sql.parser.Url.toString(java.util.Map, StringBuilder)" because "this.url" is null
at com.arcadedb.query.sql.parser.ImportDatabaseStatement.toString(ImportDatabaseStatement.java:86)
at com.arcadedb.query.sql.parser.SimpleNode.toString(SimpleNode.java:105)
at com.arcadedb.query.sql.executor.SingleOpExecutionPlan.prettyPrint(SingleOpExecutionPlan.java:106)
at com.arcadedb.server.http.handler.PostCommandHandler.lambda$execute$0(PostCommandHandler.java:117)
at java.base/java.util.Optional.ifPresent(Optional.java:178)
at com.arcadedb.server.http.handler.PostCommandHandler.execute(PostCommandHandler.java:117)
at com.arcadedb.server.http.handler.DatabaseAbstractHandler.execute(DatabaseAbstractHandler.java:100)
at com.arcadedb.server.http.handler.AbstractServerHttpHandler.handleRequest(AbstractServerHttpHandler.java:127)
at io.undertow.server.Connectors.executeRootHandler(Connectors.java:393)
at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:859)
at org.jboss.threads.ContextHandler$1.runWith(ContextHandler.java:18)
at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2513)
at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1538)
at org.xnio.XnioWorker$WorkerThreadFactory$1$1.run(XnioWorker.java:1282)
at java.base/java.lang.Thread.run(Thread.java:840)
Steps to reproduce
This works:
IMPORT DATABASE file://empty.csv WITH vertices="file://vertices.csv", verticesFileType=csv, typeIdProperty=Id, typeIdType=Long, edges="file://edges.csv", edgesFileType=csv, edgeFromField="From", edgeToField="To"
but it needs a dummy empty.csv (an empty file) to avoid useless Document insertions and typeIdType=Long to ensure vertex id types match edge "from" and "to" types.
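The workaround above can be scripted. The following is a minimal sketch, assuming the CSV files live in the working directory used by the `file://` URLs; the helper names `build_import_command` and `ensure_placeholder` are hypothetical and not part of ArcadeDB. It only creates the empty placeholder file and assembles the SQL string; sending the command to the server is left out.

```python
from pathlib import Path


def build_import_command(placeholder="empty.csv", vertices="vertices.csv", edges="edges.csv"):
    """Assemble the workaround IMPORT DATABASE command as a string.

    Hypothetical helper: the settings mirror the command shown in this report.
    """
    settings = [
        f'vertices="file://{vertices}"',
        "verticesFileType=csv",
        "typeIdProperty=Id",
        "typeIdType=Long",  # match the Long ids the importer appears to assume for edges
        f'edges="file://{edges}"',
        "edgesFileType=csv",
        'edgeFromField="From"',
        'edgeToField="To"',
    ]
    return f"IMPORT DATABASE file://{placeholder} WITH " + ", ".join(settings)


def ensure_placeholder(path="empty.csv"):
    """Create the empty dummy file for the url field; existing content is left alone."""
    p = Path(path)
    p.touch(exist_ok=True)
    return p
```

Calling `ensure_placeholder()` before running the string returned by `build_import_command()` reproduces the manual steps described above.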
PS: I am using vertices.csv and edges.csv.
Did you find a workaround for this? I have a large dataset that would take some compute to map String IDs to Long IDs. I got "found schema property Node.Id of type STRING, while analyzing the source type LONG was found".
All the nodes were created but it failed to add any edges.
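Until link types are configurable, one possible interim approach is to pre-process the CSV files so that both use integer ids, then import with `typeIdType=Long` as in the workaround from the original report. The sketch below is an illustration under stated assumptions, not an ArcadeDB feature: it assumes the column names `Id`, `From`, and `To` from the command in this report, assumes every edge endpoint appears in the vertices file, and the function name `remap_ids` is hypothetical. It assigns each distinct string id a sequential integer in a single pass over each file.

```python
import csv
import io


def remap_ids(vertices_in, edges_in, vertices_out, edges_out,
              id_field="Id", from_field="From", to_field="To"):
    """Rewrite string ids as sequential integers in both CSV streams.

    All four stream arguments are open text file objects. Returns the id mapping.
    """
    mapping = {}

    v_reader = csv.DictReader(vertices_in)
    v_writer = csv.DictWriter(vertices_out, fieldnames=v_reader.fieldnames)
    v_writer.writeheader()
    for row in v_reader:
        # setdefault keeps the first integer assigned if an id is repeated
        row[id_field] = mapping.setdefault(row[id_field], len(mapping))
        v_writer.writerow(row)

    e_reader = csv.DictReader(edges_in)
    e_writer = csv.DictWriter(edges_out, fieldnames=e_reader.fieldnames)
    e_writer.writeheader()
    for row in e_reader:
        # A KeyError here means an edge refers to an id missing from the vertices file
        row[from_field] = mapping[row[from_field]]
        row[to_field] = mapping[row[to_field]]
        e_writer.writerow(row)

    return mapping


# Usage with in-memory data; real files opened with newline="" work the same way
vertices_out, edges_out = io.StringIO(), io.StringIO()
id_map = remap_ids(io.StringIO("Id,Name\nn1,Alice\nn2,Bob\n"),
                   io.StringIO("From,To\nn1,n2\n"),
                   vertices_out, edges_out)
```

Memory use grows with the number of distinct ids, since the whole mapping is held in a dictionary; whether that is acceptable for a large dataset depends on the id count.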
@lvca Do you have any planned development for configuring link types?
Looks to be related to this hardcoded line: https://github.com/ArcadeData/arcadedb/blob/df72f59e9b9ebb07b0bf0c7adbbd6eedc4bbb44d/integration/src/test/java/com/arcadedb/integration/importer/CSVImporterIT.java#L64
No, unfortunately not yet.