
Adding support for multiple graphs

Open boggle opened this issue 8 years ago • 1 comments

CIR-2017-182

Supporting multiple graphs from within the same Cypher query would massively increase the power and expressivity of the language. This CIR asks the community to help us explore this idea at greater depth.

Background

For the purpose of this CIR, we assume an extended version of the property graph model.

  • There is a new concept called a graph
  • Graphs have properties
  • Graphs have labels
  • A graph can contain many nodes and relationships
  • Every node and relationship is contained in one or more graphs
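The extended model above can be pictured with a small data structure. This is only a sketch under stated assumptions: the class name `Graph`, its field names, and the helpers `graphs_containing` and `check_membership` are hypothetical and are not part of the CIR.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Hypothetical stand-in for a graph in the extended property graph model."""
    name: str
    labels: set = field(default_factory=set)            # graphs have labels
    properties: dict = field(default_factory=dict)      # graphs have properties
    node_ids: set = field(default_factory=set)          # nodes contained in the graph
    relationship_ids: set = field(default_factory=set)  # relationships contained in the graph


def graphs_containing(element_id, graphs):
    """Return the sorted names of all graphs containing the given node or relationship id."""
    return sorted(
        g.name
        for g in graphs
        if element_id in g.node_ids or element_id in g.relationship_ids
    )


def check_membership(element_ids, graphs):
    """Check the rule that every node and relationship is contained in at least one graph."""
    return all(graphs_containing(e, graphs) for e in element_ids)


# Made-up example data: node "n2" is shared by both graphs.
social = Graph("SocialNetwork", labels={"Public"}, properties={"owner": "alice"},
               node_ids={"n1", "n2"}, relationship_ids={"r1"})
addresses = Graph("Addresses", node_ids={"n2", "n3"})
all_graphs = [social, addresses]
```

Here membership is stored on the graph side; storing a set of graph names on each node or relationship would be an equally valid reading of the model.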

Requirements

For this CIR, we're looking for a wide range of proposals, so we impose few hard requirements. However, a full proposal is expected to touch on a sensible subset of the following topics in some form:

  • Passing graphs as input to Cypher
  • Receiving graphs as output from Cypher
  • Combining graphs (e.g. set operations)
  • Dynamically creating graphs from queries
  • Possible changes to the overall language execution model
  • Querying multiple graphs explicitly within the same query
  • Updating graphs
  • Updating graph membership (which nodes/relationships are part of a graph)
  • Representation of graphs inside a Cypher query (as value? as a context?)

Considerations

Furthermore, proposals are invited to cover the following additional facets:

  • Graphs as entities (i.e. they may have an identity, like nodes and relationships)
  • Views
  • Addressing graphs
  • Federation/Cross-database operation
  • Access control

Thank you very much!

boggle avatar Feb 03 '17 12:02 boggle

AS IS: Cypher 9

Each Cypher 9 query is a juxtaposition of its clauses, and each clause is a function that transforms a bag of environments into a bag of environments, where an environment is a binding between names and values of the following types:

  • primitives: integers, floats, booleans, strings
  • entities: nodes, relationships, paths
  • nested structures: maps and lists with entities and primitives as leaf values

Executing a query is a fold over the executions of its successive clauses, starting from a singleton bag containing the empty environment (no bindings). The fold is performed in the context of a single global graph database, which the clauses may modify.

In this sense, each Cypher 9 clause is a compilation unit that can be sent to a software agent for execution, with environments passed as the argument and environments received as the result.
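The fold described above can be sketched in a few lines of Python. This is an illustration only: `run_query`, `bind_numbers`, and `keep_large` are hypothetical names, environments are modelled as plain dicts, bags as lists, and the graph database context is left out entirely.

```python
from functools import reduce


def run_query(clauses, initial_bag=None):
    """Fold the clauses over a bag of environments.

    With no initial bag given, execution starts from the singleton bag
    holding the empty environment, as described in the text.
    """
    bag = [{}] if initial_bag is None else initial_bag
    return reduce(lambda current, clause: clause(current), clauses, bag)


def bind_numbers(bag):
    """Plays the role of a clause such as UNWIND [1, 2, 3] AS x."""
    return [dict(env, x=value) for env in bag for value in (1, 2, 3)]


def keep_large(bag):
    """Plays the role of WHERE x > 1."""
    return [env for env in bag if env["x"] > 1]


# Running the whole query in one go.
whole = run_query([bind_numbers, keep_large])

# Running the clauses as separate units, passing the bag from one to the next.
intermediate = run_query([bind_numbers])
stepwise = run_query([keep_large], intermediate)
```

Because every clause has the same type (bag in, bag out), running the clauses one at a time and handing the intermediate bag along gives the same result as running them together.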

The table below shows how each clause type affects the environment scope and the cardinality of the bag of environments.

| Clause | Environment scope | Cardinality of bag of environments |
|--------|-------------------|------------------------------------|
| CREATE | preserves or enriches | preserves |
| DELETE | preserves* | preserves |
| MATCH | preserves or enriches | regular: reduces, preserves or increases; optional: preserves or increases |
| MERGE | preserves or enriches | preserves or increases |
| REMOVE | preserves | preserves |
| SET | preserves | preserves |
| UNWIND | enriches | multiplies (including zeros) |
| WITH | overwrites | preserves |
| RETURN | overwrites | preserves |
| WHERE | preserves | preserves or reduces |

*see issue: 263
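Three rows of the table (UNWIND, WHERE, WITH) can be checked on a toy bag. The functions below are hypothetical Python stand-ins for the clauses, not Cypher semantics in full; environments are dicts and bags are lists.

```python
def unwind(bag, list_name, var_name):
    """UNWIND: enriches the scope with var_name and multiplies cardinality.

    An environment whose list is empty contributes zero environments.
    """
    return [dict(env, **{var_name: value}) for env in bag for value in env[list_name]]


def where(bag, predicate):
    """WHERE: scope is preserved, cardinality is preserved or reduced."""
    return [env for env in bag if predicate(env)]


def with_(bag, **projections):
    """WITH: scope is overwritten by the projected names, cardinality is preserved."""
    return [{name: expr(env) for name, expr in projections.items()} for env in bag]


bag = [{"xs": [1, 2]}, {"xs": []}, {"xs": [7, 8, 9]}]

unwound = unwind(bag, "xs", "x")                     # 2 + 0 + 3 environments
filtered = where(unwound, lambda env: env["x"] > 1)  # drops the environment with x = 1
projected = with_(filtered, y=lambda env: env["x"] * 2)
```

The WITH stand-in covers plain projection only; aggregation and DISTINCT are not modelled here.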

Furthermore, each subsequence of consecutive clauses of a valid Cypher 9 query is itself a compilation unit, formed by juxtaposing those clauses.

TO BE: Multiple graphs in Cypher 10

We extend the query context from a single implicit graph to multiple named graphs. The WITH clause is extended with a graph context section, e.g.:

WITH GRAPH
  States at 'state_location',
  Counties at 'counties_counties',
  Addresses at 'addresses_location',
  Taxes at 'taxes_location',
  $sn as SocialNetwork

Graph names serve as virtual labels for node patterns in MATCH, MERGE, and CREATE clauses. Such a label points to the database in which the node should be looked up. The label is applied to a node pattern with the :: operator, e.g.:

MATCH (n:Person::SocialNetwork)
RETURN n 
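One way to read the `::` operator is as a lookup that first selects a graph by name and then filters by ordinary label. The Python sketch below illustrates that reading only; `GRAPHS`, `match_nodes`, and the sample nodes are made up for the example and are not part of the proposal.

```python
# Hypothetical named graphs, each a list of nodes with ordinary labels.
GRAPHS = {
    "SocialNetwork": [
        {"id": 1, "labels": {"Person"}, "name": "Ann"},
        {"id": 2, "labels": {"Company"}, "name": "Acme"},
    ],
    "Addresses": [
        {"id": 10, "labels": {"Address"}, "city": "Oslo"},
    ],
}


def match_nodes(graphs, label, graph_name):
    """Model of MATCH (n:<label>::<graph_name>) RETURN n.

    The virtual label picks the graph to search; the regular label then
    filters the nodes inside it. One environment is produced per match.
    """
    return [{"n": node} for node in graphs[graph_name] if label in node["labels"]]


people = match_nodes(GRAPHS, "Person", "SocialNetwork")
no_people = match_nodes(GRAPHS, "Person", "Addresses")
```

The same label can therefore yield different results depending on the graph it is qualified with.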

A shortcut notation is available when a whole clause operates in the context of one graph:

IN SocialNetwork MATCH (n:Person)-[r:FRIEND]->(m)
RETURN n,r,m

We may create a relationship between nodes in two different graphs:

MATCH (p:Person::SocialNetwork), (a:Address::Addresses)
CREATE (p)-[:LIVES]->(a)

Such a relationship can be retrieved with a Cypher 10 multi-graph query:

MATCH path=(:Person::SocialNetwork)-[:LIVES]-(:Address::Addresses)
RETURN path
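The cross-graph CREATE and the undirected MATCH above can be modelled roughly as follows. This is a sketch under assumptions: the proposal does not say where a relationship spanning two graphs is stored, so it is kept in a separate list here, and all names (`graphs`, `relationships`, `nodes_with`, `create_lives`, `match_lives`) are hypothetical.

```python
# Hypothetical in-memory graphs keyed by node id; the data is made up.
graphs = {
    "SocialNetwork": {1: {"labels": {"Person"}, "name": "Ann"}},
    "Addresses": {10: {"labels": {"Address"}, "city": "Oslo"}},
}

# Assumption: cross-graph relationships live outside both graphs.
relationships = []


def nodes_with(graph_name, label):
    """Return (graph_name, node_id) references for nodes carrying the label."""
    return [
        (graph_name, node_id)
        for node_id, node in graphs[graph_name].items()
        if label in node["labels"]
    ]


def create_lives():
    """Model of MATCH (p:Person::SocialNetwork), (a:Address::Addresses) CREATE (p)-[:LIVES]->(a)."""
    for person in nodes_with("SocialNetwork", "Person"):
        for address in nodes_with("Addresses", "Address"):
            relationships.append({"type": "LIVES", "start": person, "end": address})


def match_lives():
    """Model of MATCH path=(:Person::SocialNetwork)-[:LIVES]-(:Address::Addresses)."""
    people = set(nodes_with("SocialNetwork", "Person"))
    places = set(nodes_with("Addresses", "Address"))
    paths = []
    for rel in relationships:
        if rel["type"] != "LIVES":
            continue
        # The pattern is undirected, so accept either orientation.
        if rel["start"] in people and rel["end"] in places:
            paths.append((rel["start"], rel, rel["end"]))
        elif rel["end"] in people and rel["start"] in places:
            paths.append((rel["end"], rel, rel["start"]))
    return paths


create_lives()
paths = match_lives()
```

Each endpoint is referenced as a (graph name, node id) pair, which is one possible answer to the question of how nodes are addressed across graphs.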

Virtual labels can be used in the WHERE clause in the same way as regular labels:

MATCH path=(p:Person)-[:LIVES]-(a:Address)
WHERE p::SocialNetwork AND a::Addresses
RETURN path

Summary of changes in the language:

  1. The WITH clause is extended with a GRAPH component, in which graphs are named globally and declared for later use by reference.
  2. Graph names serve as virtual labels for node patterns in MATCH, MERGE, and CREATE clauses. Such a label points to the database in which the node should be looked up or created.
  3. A relationship can be created between nodes in two different graphs.
  4. Every clause can take a shortcut modifier that sets the default source/target graph for all patterns in the clause: IN graph_name MATCH

JanekPo avatar Nov 14 '17 16:11 JanekPo