cypher query for graphql `OR` operation
Describe the bug
The `OR` operation of a GraphQL query seems to generate Cypher queries that take too long to process.
For example, I have a simple graph with child or parent relationships between terms in ontologies (I use the Cell Ontology).
Each node in this graph has id, name, source and version properties; the schema is given under Type definitions below.
Now I want to find a set of terms (nodes) in this graph.
query {
ontoTerms(
where: {
OR: [
{ id: "CL:1001610", source: "Cell Ontology", version: "v2021-06-21" }
{ id: "CL:0000037", source: "Cell Ontology", version: "v2021-06-21" }
{ id: "CL:0000805", source: "Cell Ontology", version: "v2021-06-21" }
{ id: "CL:0002354", source: "Cell Ontology", version: "v2021-06-21" }
{ id: "CL:0000035", source: "Cell Ontology", version: "v2021-06-21" }
{ id: "CL:0000988", source: "Cell Ontology", version: "v2021-06-21" }
]
}
) {
id
source
version
}
}
With the current driver, this GraphQL query translates to the following Cypher query:
MATCH (this:OntoTerm)
WHERE (this.id = $this_OR_id AND this.version = $this_OR_version AND this.source = $this_OR_source OR this.id = $this_OR1_id AND this.version = $this_OR1_version AND this.source = $this_OR1_source OR this.id = $this_OR2_id AND this.version = $this_OR2_version AND this.source = $this_OR2_source OR this.id = $this_OR3_id AND this.version = $this_OR3_version AND this.source = $this_OR3_source OR this.id = $this_OR4_id AND this.version = $this_OR4_version AND this.source = $this_OR4_source OR this.id = $this_OR5_id AND this.version = $this_OR5_version AND this.source = $this_OR5_source)
RETURN this { .id, .source, .version } as this
Replacing the parameters with actual values:
MATCH (this:OntoTerm)
WHERE (this.id = "CL:1001610" AND this.version = "v2021-06-21" AND this.source = "Cell Ontology" OR
this.id = "CL:0000037" AND this.version = "v2021-06-21" AND this.source = "Cell Ontology" OR
this.id = "CL:0000805" AND this.version = "v2021-06-21" AND this.source = "Cell Ontology" OR
this.id = "CL:0002354" AND this.version = "v2021-06-21" AND this.source = "Cell Ontology" OR
this.id = "CL:0000035" AND this.version = "v2021-06-21" AND this.source = "Cell Ontology" OR
this.id = "CL:0000988" AND this.version = "v2021-06-21" AND this.source = "Cell Ontology")
RETURN this { .id, .source, .version } as this
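To make the shape of the generated query easier to reason about, here is a minimal sketch of how an `OR` list of equality filters flattens into a single `WHERE` predicate. The helper `sketchOrToCypher` is hypothetical and is not the library's actual translator; it only mimics the parameter naming pattern visible in the query above.

```javascript
// Hypothetical helper: NOT the library's real translator, only an illustration
// of how an OR list of equality filters flattens into one WHERE predicate.
function sketchOrToCypher(orFilters, varName = "this") {
  const params = {};
  const branches = orFilters.map((filter, i) => {
    // Mimics the parameter names seen above: this_OR_id, this_OR1_id, ...
    const prefix = `${varName}_OR${i === 0 ? "" : i}`;
    return Object.entries(filter)
      .map(([key, value]) => {
        params[`${prefix}_${key}`] = value;
        return `${varName}.${key} = $${prefix}_${key}`;
      })
      .join(" AND ");
  });
  const cypher = [
    `MATCH (${varName}:OntoTerm)`,
    `WHERE (${branches.join(" OR ")})`,
    `RETURN ${varName} { .id, .source, .version } as ${varName}`,
  ].join("\n");
  return { cypher, params };
}

const { cypher, params } = sketchOrToCypher([
  { id: "CL:1001610", source: "Cell Ontology", version: "v2021-06-21" },
  { id: "CL:0000037", source: "Cell Ontology", version: "v2021-06-21" },
]);
console.log(cypher);
console.log(params);
```

Each additional entry in the `OR` list adds another branch to the same predicate, so the whole filter stays inside one `MATCH`.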
Type definitions
const typeDefs = gql`
type OntoTerm {
name: String
id: String
definition: String
version: String
source: String
child: [OntoTerm] @relationship(type: "child", direction: OUT)
parent: [OntoTerm] @relationship(type: "parent", direction: OUT)
}
`
To Reproduce
Probably any graph that has a few nodes would work.
Expected behavior
Instead, I would expect the generated Cypher to use multiple queries combined with UNION, or OPTIONAL MATCH clauses:
MATCH (o:OntoTerm)
where o.id = "CL:1001610" AND o.source = "Cell Ontology" AND o.version = "v2021-06-21"
return o
UNION
MATCH (o:OntoTerm)
where o.id = "CL:0000037" AND o.source = "Cell Ontology" AND o.version = "v2021-06-21"
return o
....
and so on for the remaining terms.
Or something like this:
MATCH (o1:OntoTerm)
where o1.id = "CL:1001610" AND o1.source = "Cell Ontology" AND o1.version = "v2021-06-21"
OPTIONAL MATCH (o2:OntoTerm)
where o2.id = "CL:0000037" AND o2.source = "Cell Ontology" AND o2.version = "v2021-06-21"
return o1, o2
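The UNION form proposed above can also be generated programmatically, for instance to benchmark it against the `OR` query. The following is a minimal sketch under that assumption; `buildUnionQuery` and the parameter names (`id0`, `source0`, ...) are made up for illustration and are not part of @neo4j/graphql.

```javascript
// Hypothetical helper: builds one parameterized MATCH per term and joins
// the parts with UNION. Not part of @neo4j/graphql.
function buildUnionQuery(terms) {
  const params = {};
  const parts = terms.map((term, i) => {
    params[`id${i}`] = term.id;
    params[`source${i}`] = term.source;
    params[`version${i}`] = term.version;
    return [
      "MATCH (o:OntoTerm)",
      `WHERE o.id = $id${i} AND o.source = $source${i} AND o.version = $version${i}`,
      "RETURN o",
    ].join("\n");
  });
  // An empty term list yields an empty string; callers should check for that.
  return { cypher: parts.join("\nUNION\n"), params };
}

const query = buildUnionQuery([
  { id: "CL:1001610", source: "Cell Ontology", version: "v2021-06-21" },
  { id: "CL:0000037", source: "Cell Ontology", version: "v2021-06-21" },
]);
console.log(query.cypher);
```

The resulting `cypher` and `params` could then be run directly, for example with `session.run(cypher, params)` from the Neo4j JavaScript driver, to compare timings with the generated query.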
System (please complete the following information):
- OS: linux
- Version: [e.g. @neo4j/[email protected]]
- Node.js version: [e.g. 17]
Additional Information
Things I've tried/questions
- Can we customize the Cypher that a query is translated to without having to write a full resolver?
- I do index all nodes by id, source and version.
- The default query takes significantly longer than individual matches or UNION operations.
Many thanks for raising this bug report @jkanche. :bug: We will now attempt to reproduce the bug based on the steps you have provided.
Please ensure that you've provided the necessary information for a minimal reproduction, including but not limited to:
- Type definitions
- Resolvers
- Query and/or Mutation (or multiple) needed to reproduce
If you have a support agreement with Neo4j, please link this GitHub issue to a new or existing Zendesk ticket.
Thanks again! :pray:
Have you tried rewriting your query using the IN operator? The query is probably so slow because the db is effectively making 6 separate lookups. If you batch the ids, you can do it all in one retrieval.
Something like this should work based on the autogenerated properties:
query {
ontoTerms(
where: {
AND: [
{ id_IN: ["CL:1001610", "CL:0000037", "CL:0000805", "CL:0002354", "CL:0000035", "CL:0000988"] },
{ source: "Cell Ontology" },
{ version: "v2021-06-21" },
]
}
) {
id
source
version
}
}
Although my example has the same version and source across all terms, they could be different in practice.
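When source and version differ between terms, the `id_IN` suggestion can still be applied per group. A minimal sketch, assuming the terms are available as plain objects: group them by (source, version) and emit one `id_IN` filter per group, combined with `OR`. The helper name `buildWhereByGroup` is hypothetical; `id_IN` is the autogenerated filter used in the query above.

```javascript
// Hypothetical helper: groups terms by (source, version) so that each group
// needs only one id_IN filter; the groups are then combined with OR.
function buildWhereByGroup(terms) {
  const groups = new Map();
  for (const { id, source, version } of terms) {
    // JSON.stringify gives an unambiguous key for the (source, version) pair.
    const key = JSON.stringify([source, version]);
    if (!groups.has(key)) {
      groups.set(key, { source, version, id_IN: [] });
    }
    groups.get(key).id_IN.push(id);
  }
  return { OR: [...groups.values()] };
}

const where = buildWhereByGroup([
  { id: "CL:1001610", source: "Cell Ontology", version: "v2021-06-21" },
  { id: "CL:0000037", source: "Cell Ontology", version: "v2021-06-21" },
  { id: "CL:0000805", source: "Cell Ontology", version: "v2020-01-01" },
]);
console.log(JSON.stringify(where, null, 2));
```

The returned object could be passed as the `where` argument of `ontoTerms` through a GraphQL variable. With this grouping, the number of `OR` branches equals the number of distinct (source, version) pairs rather than the number of terms.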