
Specify order of list elements in list constructing functions

Open jmarton opened this issue 8 years ago • 2 comments

CIR-2017-220

Overview

Using the collect() aggregation function, one can create a list of the items that fall into a certain group. Items come from the MATCHed graph objects or from previous subqueries. However, the order of the list elements is specified neither in the function call nor by the order of processing (which absolutely makes sense).

Request

It should be possible to specify the order of the list items explicitly using an optional ORDER BY specifier. If no explicit direction is given, items are sorted in ascending order.

For the sorting expression, the current restrictions apply: sorting can't be done on graph nodes and relationships, but it can be done on their properties or, in general, on any expression that evaluates to a comparable result.

Proposal for current syntax

collect( expression [ ORDER BY expression [ ASC[ENDING] | DESC[ENDING] ] ] )

Proposal for the new aggregation syntax proposed in CIP2017-04-13 (#218)

collect OF expression [ ORDER BY expression [ ASC[ENDING] | DESC[ENDING] ] ]
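
The intended semantics of both syntax variants can be modeled outside Cypher. The following is a minimal Python sketch, not an implementation from any Cypher engine: the function name `collect` and its parameters `expression`, `order_by` and `descending` are assumptions chosen to mirror the grammar above, and null handling is not modeled.

```python
from typing import Any, Callable, Iterable, List, Optional


def collect(rows: Iterable[Any],
            expression: Callable[[Any], Any],
            order_by: Optional[Callable[[Any], Any]] = None,
            descending: bool = False) -> List[Any]:
    """Hypothetical model of collect(expression [ORDER BY order_by [ASC | DESC]])."""
    rows = list(rows)
    if order_by is not None:
        # Ascending is the default direction, as requested in the proposal.
        # sorted() is stable, so rows with equal keys keep their incoming order.
        rows = sorted(rows, key=order_by, reverse=descending)
    # Without ORDER BY the incoming order is kept, which an engine does not guarantee.
    return [expression(row) for row in rows]


rows = [3, 1, 2]
unordered = collect(rows, lambda i: i)
ascending = collect(rows, lambda i: i, order_by=lambda i: i)
descending = collect(rows, lambda i: i, order_by=lambda i: i, descending=True)
```

In this model the sort key and the collected expression are independent, which matches the grammar: the ORDER BY expression does not have to be the expression that is collected.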

Examples

Example 1: collect items to a single list

For demonstration purposes, we use a locally created list as the item source. By unwinding, the 1st subquery streams its 10 records to the 2nd subquery, which then collects them. Neo4j v3.1 currently seems to collect items in the same order in which the 1st subquery returned them. However, when it comes to parallel query execution, having to preserve this implicit order might become a bottleneck.

UNWIND range(1, 10) AS i     // 1st subquery
RETURN collect(i) AS l       // 2nd subquery
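
Why the implicit order is fragile under parallel execution can be shown with a small Python model. This is only a sketch under stated assumptions: the split into three partitions and the helper `collect_partitions` are hypothetical and do not describe how Neo4j or any other engine executes the query.

```python
from itertools import permutations


def collect_partitions(partitions, arrival_order):
    """Concatenate partial lists in the order in which the workers finish."""
    result = []
    for index in arrival_order:
        result.extend(partitions[index])
    return result


# Hypothetical split of the values 1..10 across three workers.
partitions = [[1, 2, 3], [4, 5, 6], [7, 8, 9, 10]]
arrival_orders = list(permutations(range(len(partitions))))

# Without an explicit order, every arrival order yields a different list.
unordered_results = {
    tuple(collect_partitions(partitions, order)) for order in arrival_orders
}

# With an explicit ascending order, the result no longer depends on arrival order.
ordered_results = {
    tuple(sorted(collect_partitions(partitions, order))) for order in arrival_orders
}
```

In this model the unordered collect can produce six different lists, while the explicitly ordered collect always produces the same one.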

Explicit ordering could be done as follows.

For the current syntax:

UNWIND range(1, 10) AS i                    // 1st subquery
RETURN collect(i ORDER BY i ASC) AS l       // 2nd subquery

For the CIP2017-04-13 syntax:

UNWIND range(1, 10) AS i                      // 1st subquery
RETURN collect OF i ORDER BY i ASC AS l       // 2nd subquery

Example 2: group graph objects based on the parity of their ids

We create 10 nodes with the label CollectExample having ids ranging from 1 to 10:

UNWIND range(1, 10) AS i
CREATE (:CollectExample {id: i})

Then we collect nodes grouped by parity of their id property:

MATCH (n:CollectExample)     // 1st subquery
WITH n.id%2=1 AS odd, n
RETURN odd, collect(n) AS l  // 2nd subquery

Explicit descending ordering on the node id could be done as follows.

For the current syntax:

MATCH (n:CollectExample)                         // 1st subquery
WITH n.id%2=1 AS odd, n
RETURN odd, collect(n ORDER BY n.id DESC) AS l   // 2nd subquery

For the CIP2017-04-13 syntax:

MATCH (n:CollectExample)                          // 1st subquery
WITH n.id%2=1 AS odd, n
RETURN odd, collect OF n ORDER BY n.id DESC AS l  // 2nd subquery
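
The result expected from Example 2 can be sketched in Python as a rough model of the proposed semantics. The helper `collect_grouped` and the dictionary representation of nodes are assumptions made for illustration only.

```python
from collections import defaultdict

# Stand-ins for the ten :CollectExample nodes with ids 1..10.
nodes = [{"id": i} for i in range(1, 11)]


def collect_grouped(rows, group_key, order_key, descending=False):
    """Group rows by group_key, then order each group's list by order_key."""
    groups = defaultdict(list)
    for row in rows:
        groups[group_key(row)].append(row)
    return {
        key: sorted(members, key=order_key, reverse=descending)
        for key, members in groups.items()
    }


# Mirrors: WITH n.id%2=1 AS odd, n RETURN odd, collect(n ORDER BY n.id DESC) AS l
result = collect_grouped(
    nodes,
    group_key=lambda n: n["id"] % 2 == 1,
    order_key=lambda n: n["id"],
    descending=True,
)

odd_ids = [n["id"] for n in result[True]]    # [9, 7, 5, 3, 1]
even_ids = [n["id"] for n in result[False]]  # [10, 8, 6, 4, 2]
```

The ordering is applied within each group separately, so the two lists are sorted independently of each other.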

jmarton avatar Apr 18 '17 10:04 jmarton

Related to #190

Mats-SX avatar Apr 18 '17 11:04 Mats-SX

For the sorting expression, the current restrictions apply: sorting can't be done on graph nodes and relationships, but it can indeed be done on their properties, or in general, on expressions that evaluate to comparable result.

The comparability CIP actually defines sorting/ordering for all values, including entities (nodes and relationships).

Mats-SX avatar Apr 18 '17 11:04 Mats-SX