cypher query for graphql `OR` operation

Open jkanche opened this issue 2 years ago • 3 comments

Describe the bug

The `OR` operation of a GraphQL query seems to generate Cypher queries that take too long to process.

For example, I have a simple graph with child and parent relationships between terms in ontologies (I use the Cell Ontology).

Each node in this graph has id, name, source and version properties. The schema looks something like this:

[image: graph schema]

Now I want to find a set of terms (nodes) in this graph.

query {
	ontoTerms(
		where: {
			OR: [
				{ id: "CL:1001610", source: "Cell Ontology", version: "v2021-06-21" }
				{ id: "CL:0000037", source: "Cell Ontology", version: "v2021-06-21" }
				{ id: "CL:0000805", source: "Cell Ontology", version: "v2021-06-21" }
				{ id: "CL:0002354", source: "Cell Ontology", version: "v2021-06-21" }
				{ id: "CL:0000035", source: "Cell Ontology", version: "v2021-06-21" }
				{ id: "CL:0000988", source: "Cell Ontology", version: "v2021-06-21" }
			]
		}
	) {
		id
		source
		version
	}
}

With the current driver, this GraphQL query translates to the following Cypher query:

MATCH (this:OntoTerm)
WHERE (this.id = $this_OR_id AND this.version = $this_OR_version AND this.source = $this_OR_source OR this.id = $this_OR1_id AND this.version = $this_OR1_version AND this.source = $this_OR1_source OR this.id = $this_OR2_id AND this.version = $this_OR2_version AND this.source = $this_OR2_source OR this.id = $this_OR3_id AND this.version = $this_OR3_version AND this.source = $this_OR3_source OR this.id = $this_OR4_id AND this.version = $this_OR4_version AND this.source = $this_OR4_source OR this.id = $this_OR5_id AND this.version = $this_OR5_version AND this.source = $this_OR5_source)
RETURN this { .id, .source, .version } as this

Replacing the variables with actual values:

MATCH (this:OntoTerm)
WHERE (this.id = "CL:1001610" AND this.version = "v2021-06-21" AND this.source = "Cell Ontology" OR 
    this.id = "CL:0000037" AND this.version = "v2021-06-21" AND this.source = "Cell Ontology" OR 
    this.id = "CL:0000805" AND this.version = "v2021-06-21" AND this.source = "Cell Ontology" OR 
    this.id = "CL:0002354" AND this.version = "v2021-06-21" AND this.source = "Cell Ontology" OR 
    this.id = "CL:0000035" AND this.version = "v2021-06-21" AND this.source = "Cell Ontology" OR 
    this.id = "CL:0000988" AND this.version = "v2021-06-21" AND this.source = "Cell Ontology")
RETURN this { .id, .source, .version } as this

Type definitions

const typeDefs = gql`
    type OntoTerm {
        name: String
        id: String
        definition: String
        version: String
        source: String
        child: [OntoTerm] @relationship(type: "child", direction: OUT)
        parent: [OntoTerm] @relationship(type: "parent", direction: OUT)
    }
`

To Reproduce

Probably any graph that has a few nodes would work.

Expected behavior

Instead, I would expect multiple queries combined with unions or optional matches:

MATCH (o:OntoTerm)
WHERE o.id = "CL:1001610" AND o.source = "Cell Ontology" AND o.version = "v2021-06-21"
RETURN o

UNION

MATCH (o:OntoTerm)
WHERE o.id = "CL:0000037" AND o.source = "Cell Ontology" AND o.version = "v2021-06-21"
RETURN o

....
....
and so on for the remaining
....

Or something like this:

MATCH (o1:OntoTerm)
WHERE o1.id = "CL:1001610" AND o1.source = "Cell Ontology" AND o1.version = "v2021-06-21"
OPTIONAL MATCH (o2:OntoTerm)
WHERE o2.id = "CL:0000037" AND o2.source = "Cell Ontology" AND o2.version = "v2021-06-21"
RETURN o1, o2
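
For illustration, the UNION form suggested above could be assembled from a list of terms with a small helper. This is a minimal sketch in plain JavaScript, not something the library does: `buildUnionQuery` and the `$id0`-style parameter names are hypothetical, and the result would still have to be run through a Neo4j session.

```javascript
// Hypothetical helper (not part of @neo4j/graphql): builds one parameterized
// UNION query from a list of { id, source, version } objects.
function buildUnionQuery(terms) {
    if (terms.length === 0) {
        throw new Error("at least one term is required");
    }
    const params = {};
    const branches = terms.map((term, i) => {
        // One set of parameters per branch, e.g. $id0, $source0, $version0.
        params[`id${i}`] = term.id;
        params[`source${i}`] = term.source;
        params[`version${i}`] = term.version;
        return [
            "MATCH (o:OntoTerm)",
            `WHERE o.id = $id${i} AND o.source = $source${i} AND o.version = $version${i}`,
            "RETURN o",
        ].join("\n");
    });
    return { cypher: branches.join("\nUNION\n"), params };
}

const { cypher, params } = buildUnionQuery([
    { id: "CL:1001610", source: "Cell Ontology", version: "v2021-06-21" },
    { id: "CL:0000037", source: "Cell Ontology", version: "v2021-06-21" },
]);
console.log(cypher);
console.log(params);
```

Keeping the values in parameters rather than inlining them avoids quoting problems. Whether this form is actually faster on a given database would need to be checked with PROFILE.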

Additional Information

Things I've tried/questions

  • Can we customize what the underlying queries get translated to without having to write a full resolver?
  • I do index all nodes by id, source and version.
  • The default query takes significantly longer than individual match/union operations.
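
On the first question, one possible shape for a hand-written statement is a single UNWIND over a list parameter, so that each (id, source, version) tuple becomes one lookup. The sketch below only builds the statement and its parameters. `buildUnwindQuery` is a hypothetical name, and it is an assumption that the planner will use the existing index for this pattern. The library's `@cypher` directive may be a way to attach a statement like this to a custom query field without a full resolver, but its exact arguments depend on the library version and should be verified.

```javascript
// Hypothetical helper: one statement, one list parameter, one lookup per tuple.
function buildUnwindQuery(terms) {
    const cypher = [
        "UNWIND $terms AS term",
        "MATCH (o:OntoTerm { id: term.id, source: term.source, version: term.version })",
        "RETURN o { .id, .source, .version } AS o",
    ].join("\n");
    // Keep only the three lookup keys so unrelated fields are not sent along.
    const rows = terms.map(({ id, source, version }) => ({ id, source, version }));
    return { cypher, params: { terms: rows } };
}

const unwind = buildUnwindQuery([
    { id: "CL:1001610", source: "Cell Ontology", version: "v2021-06-21" },
    { id: "CL:0000037", source: "Cell Ontology", version: "v2021-06-21", name: "ignored" },
]);
console.log(unwind.cypher);
// With neo4j-driver, this would be run as: session.run(unwind.cypher, unwind.params)
```

Unlike the generated `OR` predicate, the statement text stays the same no matter how many terms are requested; only the `$terms` list grows.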

jkanche avatar Mar 25 '22 18:03 jkanche

Many thanks for raising this bug report @jkanche. :bug: We will now attempt to reproduce the bug based on the steps you have provided.

Please ensure that you've provided the necessary information for a minimal reproduction, including but not limited to:

  • Type definitions
  • Resolvers
  • Query and/or Mutation (or multiple) needed to reproduce

If you have a support agreement with Neo4j, please link this GitHub issue to a new or existing Zendesk ticket.

Thanks again! :pray:

neo4j-team-graphql avatar Mar 25 '22 18:03 neo4j-team-graphql

Have you tried rewriting your query using the IN operator? The query is so slow because the db is making 6 separate calls. If you batch the ids, you can do it all in one retrieval.

Something like this should work based on the autogenerated properties:

query {
  ontoTerms(
    where: {
      AND: [
        { id_IN: ["CL:1001610", "CL:0000037", "CL:0000805", "CL:0002354", "CL:0000035", "CL:0000988"] },
        { source: "Cell Ontology" },
        { version: "v2021-06-21" },
      ]
    }
  ) {
    id
    source
    version
  }
}

litewarp avatar Mar 26 '22 03:03 litewarp

Although version and source are the same across terms in my example, they could be different.
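
When source and version do differ, the `id_IN` suggestion might still be combined with `OR`: group the terms by (source, version) and emit one branch per group. This is a minimal sketch, not a tested fix. `buildWhereFilter` is a hypothetical helper, the second version string is made up for the example, and the result is assumed to be passed as a GraphQL variable of the generated `OntoTermWhere` input type.

```javascript
// Hypothetical helper: groups terms that share (source, version) and emits
// one OR branch per group, each with an id_IN list.
function buildWhereFilter(terms) {
    const groups = new Map();
    for (const { id, source, version } of terms) {
        const key = JSON.stringify([source, version]);
        if (!groups.has(key)) {
            groups.set(key, { source, version, id_IN: [] });
        }
        groups.get(key).id_IN.push(id);
    }
    return { OR: [...groups.values()] };
}

const where = buildWhereFilter([
    { id: "CL:1001610", source: "Cell Ontology", version: "v2021-06-21" },
    { id: "CL:0000037", source: "Cell Ontology", version: "v2021-06-21" },
    { id: "CL:0000037", source: "Cell Ontology", version: "v2022-01-01" }, // made-up version
]);
console.log(JSON.stringify(where, null, 2));
```

The number of `OR` branches then tracks the number of distinct (source, version) pairs rather than the number of terms. Whether that improves the generated Cypher enough is something to measure.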

jkanche avatar Mar 26 '22 15:03 jkanche