Inconsistent Behavior between Kuzu and Neo4j in Handling Self-Loops
Hello there,
I've encountered some peculiar behavior while testing Kuzu on a graph with self-loops. Specifically, certain queries seem to yield unexpected results.
I also ran the same queries in Neo4j and noticed a discrepancy: Kuzu appears to return duplicated results, and the number of duplicates grows exponentially with the length of the pattern.
Here is a simplified reproduction for the command-line shell:
CREATE NODE TABLE User(name STRING, age INT64, PRIMARY KEY (name));
CREATE REL TABLE Follows(FROM User TO User, since INT64);
MATCH (n1:User), (n2:User) WHERE (n1.name = "Alice") AND (n2.name = "Alice") CREATE (n1)-[:Follows]->(n2);
MATCH (n1)-[]-(n1) RETURN *;
-----------------------------------------
| n1 |
-----------------------------------------
| {_ID: 0:0, _LABEL: User, name: Alice} |
-----------------------------------------
| {_ID: 0:0, _LABEL: User, name: Alice} |
-----------------------------------------
MATCH (n1)-[]-(n1)-[]-(n1) RETURN *;
-----------------------------------------
| n1 |
-----------------------------------------
| {_ID: 0:0, _LABEL: User, name: Alice} |
-----------------------------------------
| {_ID: 0:0, _LABEL: User, name: Alice} |
-----------------------------------------
| {_ID: 0:0, _LABEL: User, name: Alice} |
-----------------------------------------
| {_ID: 0:0, _LABEL: User, name: Alice} |
-----------------------------------------
In Neo4j, the first query returns only one line and the second query returns an empty set, because Neo4j does not allow the same relationship to be matched more than once within a single MATCH pattern.
For Kuzu, I think the first query should return one line, but it returns two. For the second query, I think either one line or zero lines would be acceptable, but four lines looks unexpected to me.
I suspect there is a bug in how Kuzu handles self-loops, so I ran an additional query to verify:
MATCH (n1)-[]-(n1)-[]-(n1)-[]-(n1)-[]-(n1)-[]-(n1)-[]-(n1)-[]-(n1)-[]-(n1)-[]-(n1)-[]-(n1)-[]-(n1) RETURN COUNT(*);
----------------
| COUNT_STAR() |
----------------
| 2048 |
----------------
As you can see, the query yields a search space of exponential size, with many copies of {_ID: 0:0, _LABEL: User, name: Alice}, even though the graph contains only one node, Alice, and only one line is needed.
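One possible reading of these numbers is that each undirected hop matches the self-loop once per direction and that the same relationship may be reused within a pattern, which would give 2^k lines for k hops (2, 4, and 2048 for 1, 2, and 11 hops). The toy model below is only a sketch under that assumption and is not Kuzu's or Neo4j's actual implementation; the function `count_matches` and its flags are hypothetical names introduced for illustration.

```python
from itertools import product

# Toy graph from this report: one node and one directed self-loop.
NODES = ["Alice"]
EDGES = [("Alice", "Alice")]  # Alice-[:Follows]->Alice


def count_matches(nodes, edges, hops, loops_both_directions, unique_rels):
    """Count bindings of (n1)-[]-(n1)-...-(n1) with `hops` undirected relationships.

    loops_both_directions: assumption that a self-loop is matched once
        forward and once backward by an undirected hop.
    unique_rels: assumption that a relationship may appear at most once
        per matched pattern.
    """
    # Build every way a single undirected hop can traverse an edge.
    traversals = []
    for idx, (src, dst) in enumerate(edges):
        traversals.append((idx, src, dst))  # forward
        if src != dst or loops_both_directions:
            traversals.append((idx, dst, src))  # backward

    count = 0
    for n in nodes:
        for path in product(traversals, repeat=hops):
            # Every hop must start and end at the same node n1.
            if any(a != n or b != n for _, a, b in path):
                continue
            used = [idx for idx, _, _ in path]
            if unique_rels and len(set(used)) != len(used):
                continue
            count += 1
    return count


# Model matching the counts observed in Kuzu: both directions, reuse allowed.
for k in (1, 2, 11):
    print(k, count_matches(NODES, EDGES, k, True, False))  # 2, 4, 2048

# Model matching the counts observed in Neo4j: one direction for a loop, no reuse.
for k in (1, 2):
    print(k, count_matches(NODES, EDGES, k, False, True))  # 1, 0
```

If this model is right, the 2048 lines come from 2 choices of direction at each of the 11 hops rather than from any additional data in the graph.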
It would be highly appreciated if you could further verify and investigate this report.
Best regards, Qiuyang