[BUG] labelFilter for subgraph_all doesn't work as documented
Describe the bug
In the documentation, examples for subgraph_all are given. Unfortunately, they don't work as described.
To Reproduce Run the following query:
CREATE (w:Wolf)-[ca:CATCHES]->(d:Dog), (c:Cat), (m:Mouse), (h:Human);
MATCH (w:Wolf), (d:Dog), (c:Cat), (m:Mouse), (h:Human)
WITH w, d, c, m, h
CREATE (d)-[:CATCHES]->(c)
CREATE (c)-[:CATCHES]->(m)
CREATE (d)-[:FRIENDS_WITH]->(m)
CREATE (h)-[:OWNS]->(d)
CREATE (h)-[:HUNTS]->(w)
CREATE (h)-[:HATES]->(m);
MATCH (dog:Dog)
CALL path.subgraph_all(dog, {
labelFilter: ["/Mouse"]
})
YIELD nodes AS nodes, rels AS rels
RETURN nodes, rels
This returns only a single node, Mouse:
{
"nodes": [
{ "id": 13, "labels": ["Mouse"], "properties": {}, "type": "node" }
],
"rels": []
}
Expected behavior
According to the documentation, this filtering will return 3 paths: Dog->Cat->Mouse, Dog<-Human->Mouse, Dog->Mouse.
Additional context I tried this in the newest Memgraph version 3.2.1.
Thank you @thomas-tran-de for reporting the issue. I managed to reproduce it and can confirm that the same behavior happens to me as well, and it is opposite to what's described.
Are you blocked by this currently?
Thanks for the quick reply @katarinasupe. No, I'm not blocked by this. I came up with a workaround for my specific use case which isn't as elegant but leads to the same result. I'll switch back to my initial idea when path.subgraph_all is fixed.
Hi @katarinasupe, this just got more important to me.
I went through the label filter examples in the documentation and none of them work.
Could the issue be AreLabelsValid (Link)? It looks too restrictive. Whenever a termination or end node is used, only those will be returned. This can be illustrated by specifying all intermediate nodes as end nodes:
| Starting Node | Label Filter | Paths Returned | Paths Expected | Workaround Filter |
|---|---|---|---|---|
| Dog | ["/Mouse"] |
Mouse |
Dog->Cat->Mouse, Dog<-Human->Mouse, Dog->Mouse |
[">Dog", "/Mouse", ">Cat", ">Human"] |
| Dog | ["/Mouse", "Cat"] |
None | Dog->Mouse, Dog->Cat->Mouse |
[">Dog", "/Mouse", ">Cat"] |
| Dog | ["/Mouse", "-Cat", "-Human"] |
Mouse |
Dog->Mouse |
[">Dog", "/Mouse"] |
| Cat | [">Dog", "+Human", "+Wolf"] |
None | Cat<-Dog, Cat<-Dog<-Human->Wolf->Dog |
[">Cat", ">Dog", ">Human", ">Wolf"] |
Maybe !config_.label_bools_status.termination_activated && !config_.label_bools_status.end_node_activated && Whitelisted(label_bools.whitelisted) just needs to be replaced with Whitelisted(label_bools.whitelisted)? As a user, I expect the whitelist to work independent of termination and end nodes.
Two more issues I just encountered:
- The default for
filterStartNodeistruecontrary to what the documentation says. - Even when setting
filterStartNodetofalse, the graph stops expanding at the initial node if it's not explicitly whitelisted. This issue is likely becauseContinueExpandingdoesn't handlefilterStartNodeat all. It is called inBFSfor all nodes.
To reproduce, use the example graph from the documentation.
| Query | Result | Expected |
|---|---|---|
MATCH (dog:Dog)CALL path.subgraph_all(dog, {labelFilter: ["Cat"]})YIELD nodes, relsRETURN nodes, rels |
nothing | Dog->Cat |
MATCH (dog:Dog)CALL path.subgraph_all(dog, {labelFilter: ["Cat"], filterStartNode: false})YIELD nodes, relsRETURN nodes, rels |
Dog |
Dog->Cat |
MATCH (dog:Dog)CALL path.subgraph_all(dog, {labelFilter: ["Cat", "Dog"]})YIELD nodes, relsRETURN nodes, rels |
Dog->Cat |
Dog->Cat |
MATCH (dog:Dog)CALL path.subgraph_all(dog, {labelFilter: ["Cat", "-Dog"]})YIELD nodes, relsRETURN nodes, rels |
nothing | Dog->Cat |
MATCH (dog:Dog)CALL path.subgraph_all(dog, {labelFilter: ["Cat", "-Dog"], filterStartNode: false})YIELD nodes, relsRETURN nodes, rels |
Dog |
Dog->Cat |
This is now blocking me from further development. Do you know when someone has time to look into this, @katarinasupe?
Hi @thomas-tran-de, I've forwarded the info and the team is aware. This is currently not a priority and they will get back to you as soon as they can take a deeper look.
I did some more investigating. Now that I had a closer look at the code, I realized that the documentation primarily focuses on path.expand, not path.subgraph_*. Here are the key things that threw me off:
-
labelFilteris used to check which paths to traverse but it isn't strictly related to which nodes are returned. If using end or termination nodes, only these will be returned even if other labels are part of the whitelist. - The documentation states that "expansion will only go through nodes with whitelisted labels" for end nodes. This is also true for termination nodes.
- The first node will always be returned if
filterStartNode == false. This is not related to expanding the graph though. I found this very confusing and that's why I created a pull request. - If
relationshipFiltercontains an edge going both directions, i.e.,<EDGE>, it will always return the corresponding child node regardless oflabelFilter. - Like
labelFilter,relationshipFilterwill only determine how to find relevant paths. The returned relationships forsubgraph_allare just all relationships between the returned nodes.