neosemantics
Found multiple nodes with label...while only one was expected.
I am trying to import one of my RDF datasets into Neo4j using this plugin. However, the import procedure gives me the following error:
Neo.ClientError.Procedure.ProcedureCallFailed: Failed to invoke procedure
semantics.importRDF: Caused by: org.neo4j.graphdb.MultipleFoundException: Found multiple nodes with label: 'Resource', property name: 'uri' and property value: 'https://rdf.mydomain.eu/annotation_8_79_1522761126' while only one was expected.
I think it happens when I add duplicate subjects. I assumed if the URIs are the same, NeoSemantics will automatically merge the nodes but it doesn't seem to work! Can you help me with that error?
I use the following command to import:
CALL semantics.importRDF("http://mydomain/000003.ttl", "Turtle",{shortenUrls: true, typesToLabels: true, commitSize: 25000})
Could you share the RDF file you're trying to load so I can reproduce the problem?
It happens when I materialize a symmetric relation. E.g.:
@prefix cRQ: <http://example.com/r/question/> .
@prefix cR: <http://example.com/r/message/> .
@prefix cV: <http://example.com/vocab/> .

cR:1849639 cV:question cRQ:263188 .
cRQ:263188 cV:message cR:1849639 .
Then it gives me the error:
Neo.ClientError.Procedure.ProcedureCallFailed: Failed to invoke procedure `semantics.importRDF`: Caused by: org.neo4j.graphdb.MultipleFoundException: Found multiple nodes with label: 'Resource', property name: 'uri' and property value: 'http://example.com/r/question/263188' while only one was expected.
It happens only when I have already inserted some triples about that node. I want to add more triples to the same indexed node (by importing multiple TTL files), but it does not seem to work!
Thanks for coming back on this, @ali1k !
Here's how I'm trying to reproduce it:
I've created three files: one with triples for the first resource, cR:1849639 (attached file bug47-a.ttl); another with triples for the second resource, cRQ:263188 (attached file bug47-b.ttl); and a third with the fragment you shared (attached file bug47-c.ttl), connecting the two with a symmetric relation.
I load them in sequence as you describe...
call semantics.importRDF("https://github.com/jbarrasa/neosemantics/files/3359356/bug47-a.ttl.txt","Turtle");
call semantics.importRDF("https://github.com/jbarrasa/neosemantics/files/3359356/bug47-b.ttl.txt","Turtle");
call semantics.importRDF("https://github.com/jbarrasa/neosemantics/files/3359356/bug47-c.ttl.txt","Turtle");
...but I get no errors, it all works as expected.
What am I doing wrong?
What versions of Neo4j and neosemantics are you working on? Could you help me with a way to reproduce it? Maybe sharing a larger dataset privately?
Thanks!
Thanks for the reply, @jbarrasa. I am trying to reproduce the bug, but it's hard! With the sample files you shared, it works well. I have a large dataset that is private, so I cannot share it here. As soon as I can reproduce the problem with a smaller sample, I will come back to you.
I'm also facing the same issue. In my case, I suspect that multiple imports are being executed in parallel (ajax calls) with RDF files that deal with the same classes (sample files attached).
Failed to invoke procedure semantics.importRDF: Caused by: org.neo4j.graphdb.MultipleFoundException: Found multiple nodes with label: 'Resource', property name: 'uri' and property value: 'http://linkeddata.uni-muenster.de/ontology/musicscore#Quarter' while only one was expected.
If I change my ajax call to async: false, everything works just fine. I'm most likely missing something in my execute query method, since it is very simplistic :-)
public StatementResult executeQuery(String cypher, DataSource ds){
    ...
    Driver driver = GraphDatabase.driver(ds.getHost(),AuthTokens.basic(ds.getUser(),ds.getPassword()));
    StatementResult result;
    try(Transaction tx = driver.session().beginTransaction())
    {
        result = tx.run(cypher);
        tx.success();
        tx.close();
    }
    driver.session().close();
    return result;
}
Any hint would be much appreciated!
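The effect of async: false (one import at a time) can also be approximated on the Java side by funnelling every import call through a single worker thread. The following is only a minimal sketch under stated assumptions: SerialImportQueue is a hypothetical helper, not part of neosemantics or the Neo4j driver, and the submitted task stands in for whatever code actually runs the semantics.importRDF query.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical helper (an assumption for illustration): runs submitted tasks strictly one after another.
final class SerialImportQueue implements AutoCloseable {

    // A single worker thread means at most one submitted import runs at any time.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // importCall is a placeholder for the code that executes the import query.
    <T> Future<T> submit(Callable<T> importCall) {
        return worker.submit(importCall);
    }

    // Stops accepting new tasks and waits (up to one minute) for queued ones to finish.
    @Override
    public void close() throws InterruptedException {
        worker.shutdown();
        worker.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```

Callers keep receiving a Future immediately, so the web requests themselves can stay asynchronous while the imports are applied to the database in sequence. This only serializes calls made through the same queue instance in the same JVM.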
Hi, thanks for submitting the issue, @jimjonesbr. I'll try to run some parallel executions to reproduce it, but in the meantime it might be worth trying the following.
I understand you've run this fragment before running the RDF import:
CREATE INDEX ON :Resource(uri)
I suggest you run this now:
CREATE CONSTRAINT ON (r:Resource) ASSERT r.uri IS UNIQUE
This should guarantee that no two Resources with the same value for the property uri are created. If your database is already populated and contains duplicates, you may need to either clear it or fix the offending nodes.
Let me know if it helps in any way. I'll test on my side the concurrent execution too.
PS: @ali1k it would be great if you could check if this affects the behaviour you're seeing.
Hi @jbarrasa, thanks so much for the quick response! I applied the constraint as you suggested and the previous error message no longer pops up, but a new one does :-) I'm checking my script to make sure it's not something wrong on my side.
org.neo4j.driver.v1.exceptions.ClientException: Failed to invoke procedure `semantics.importRDF`: Caused by: IndexEntryConflictException{propertyValues=( String("http://dbpedia.org/resource/Composer") ), addedNodeId=-1, existingNodeId=129565}
at org.neo4j.driver.internal.util.ErrorUtil.newNeo4jError(ErrorUtil.java:61)
at org.neo4j.driver.internal.async.inbound.InboundMessageDispatcher.handleFailureMessage(InboundMessageDispatcher.java:137)
at org.neo4j.driver.internal.messaging.PackStreamMessageFormatV1$ReaderV1.unpackFailureMessage(PackStreamMessageFormatV1.java:336)
at org.neo4j.driver.internal.messaging.PackStreamMessageFormatV1$ReaderV1.read(PackStreamMessageFormatV1.java:300)
at org.neo4j.driver.internal.async.inbound.InboundMessageHandler.channelRead0(InboundMessageHandler.java:82)
at org.neo4j.driver.internal.async.inbound.InboundMessageHandler.channelRead0(InboundMessageHandler.java:34)
at org.neo4j.driver.internal.shaded.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at org.neo4j.driver.internal.shaded.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
at org.neo4j.driver.internal.shaded.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284)
at org.neo4j.driver.internal.async.inbound.MessageDecoder.channelRead(MessageDecoder.java:39)
at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at org.neo4j.driver.internal.shaded.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
at org.neo4j.driver.internal.shaded.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:297)
at org.neo4j.driver.internal.shaded.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:413)
at org.neo4j.driver.internal.shaded.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265)
at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at org.neo4j.driver.internal.shaded.io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1389)
at org.neo4j.driver.internal.shaded.io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1159)
at org.neo4j.driver.internal.shaded.io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1203)
at org.neo4j.driver.internal.shaded.io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:489)
at org.neo4j.driver.internal.shaded.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:428)
at org.neo4j.driver.internal.shaded.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265)
at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at org.neo4j.driver.internal.shaded.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1414)
at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at org.neo4j.driver.internal.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at org.neo4j.driver.internal.shaded.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:945)
at org.neo4j.driver.internal.shaded.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:146)
at org.neo4j.driver.internal.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
at org.neo4j.driver.internal.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
at org.neo4j.driver.internal.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
at org.neo4j.driver.internal.shaded.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
at org.neo4j.driver.internal.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:886)
at org.neo4j.driver.internal.shaded.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
Suppressed: org.neo4j.driver.v1.exceptions.ClientException: Transaction rolled back even if marked as successful
As soon as I make some progress I'll report it here.
Hi, it looks like the index might have got corrupted? (https://stackoverflow.com/a/51325475/7776883) It might be worth dropping and recreating the constraint.
However, if you're running multiple threads in parallel and you're using the URI shortening option, you will face problems with the NamespacePrefixDefinition node, as the different threads will try to modify it concurrently and the current version is not thread-safe.
As a temporary workaround you have two options:
The easiest one is to run your import ignoring namespaces (if this is an option for you). All you need to do is set the param: handleVocabUris:'IGNORE'
If you need to keep the namespaces and you know the set of namespaces that your RDF fragments are going to import into the graph, try to pre-populate the NamespacePrefixDefinition node like so:
CREATE (:NamespacePrefixDefinition {
  `http://www.example.com/myvoc/1.0.0#`: 'voc1',
  `...`: 'voc2',
  `http://www.w3.org/1999/02/22-rdf-syntax-ns#`: 'rdf'})
then set a uniqueness constraint also on this label (make sure you include the entry for rdf):
CREATE CONSTRAINT ON (n:NamespacePrefixDefinition)
ASSERT n.`http://www.w3.org/1999/02/22-rdf-syntax-ns#` IS UNIQUE
This should work as a temporary fix until the problem is solved, hopefully in the next release.
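If the list of namespaces is long, the CREATE statement above can be generated rather than typed by hand, which avoids slips such as a missing comma between entries. The sketch below is illustrative only: NamespaceDefinitionCypher and its build method are hypothetical names, and it assumes namespace URIs contain no backticks and prefixes contain no quotes or backslashes (it rejects them rather than escaping them).

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper (an assumption for illustration, not part of neosemantics).
final class NamespaceDefinitionCypher {

    // Builds a CREATE statement whose property keys are namespace URIs
    // (backtick-quoted) and whose values are the prefixes (single-quoted).
    static String build(Map<String, String> prefixByNamespace) {
        StringJoiner props = new StringJoiner(", ", "{", "}");
        for (Map.Entry<String, String> e : prefixByNamespace.entrySet()) {
            String ns = e.getKey();
            String prefix = e.getValue();
            // Refuse characters that would need escaping instead of guessing at it.
            if (ns.contains("`") || prefix.contains("'") || prefix.contains("\\")) {
                throw new IllegalArgumentException("unsupported character in " + ns + " / " + prefix);
            }
            props.add("`" + ns + "`: '" + prefix + "'");
        }
        return "CREATE (:NamespacePrefixDefinition " + props + ")";
    }
}
```

Passing a LinkedHashMap keeps the entries in insertion order. Remember to include the entry for the rdf namespace, as noted above.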
PS: If possible, it would be great if you could share the data you're loading so I can try to reproduce the problem and create a unit test from it.