openCypher
openCypher copied to clipboard
UNWIND CIP
Extended CIP to also include OPTIONAL UNWIND
from #234.
Note on TCK impact of this PR:
All language features should be properly grouped into TCK features, with only relevant scenarios in each. Currently many TCK features contain a varied set of scenarios -- these are conventionally named *Acceptance.feature
and have a historical background as Neo4j acceptance tests. Curated TCK features use another naming convention, and this PR provides an improvement in this area.
For OPTIONAL UNWIND
I simply copied the TCK feature of UNWIND
and modified the queries to use OPTIONAL UNWIND
. This resulted in several scenarios which exemplify standard UNWIND
semantics. Happy to discuss whether this approach is a good idea, or whether the OPTIONAL UNWIND
scenarios only should consider cases using empty lists or null
.
Regarding OPTIONAL UNWIND, I support the addition.
We've had many users stumble across situations where they UNWIND an empty list and end up wiping out rows. Many still see UNWIND as a kind of iteration mechanism similar to FOREACH, and end up working with it in the same way (and in some situations we have no choice, since only write clauses are allowed in FOREACH) and are then surprised when their empty list ends up wiping out rows on an UNWIND.
I think providing an OPTIONAL approach to UNWIND can give our users more control over what should happen in these situations.
I am wonder whether something like FOREACH (UNWIND ...) that was more explicit about the iteration behaviour would be helpful? The implicit nature of iteration is a blessing and a curse, perhaps?
I don't think combining the two is the right answer. UNWIND isn't iteration, and it's not bounded the way FOREACH is, and I think that's okay.
And whether the user is approaching UNWIND with iteration in mind or not, the wiping out of rows when the list is empty can be surprising, and requires a CASE dance to use a [null]
for the structure in order to get around it. I don't want to change existing behavior, as that is useful too, but having a quick and easy way to guard against wiping out the row on an empty list UNWIND (in cases where you need that fallback) would be helpful.
@InverseFalcon, @alastai: The thinking of the Cypher designers at Neo4j has been to deprecate (and eventually remove) FOREACH
in favour of subqueries that do not affect cardinality (tentative keyword DO
) in combination with UNWIND
. So that you would spell iteration in the following manner:
MATCH (x:Bar)
DO {
UNWIND $someCollection AS item
CREATE (y:Foo{name:item})
CREATE (x)-[:LIKES]->(y)
}
@thobe No complaints there, that looks good to me.
And that will cover many of the problem cases, since it looks like that wouldn't change cardinality.
I still think there will be cases where we couldn't use this approach, as the bounds of the DO {} block would mess up what we eventually need to perform or return later, so an OPTIONAL UNWIND may still be needed.
@InverseFalcon Indeed, OPTIONAL UNWIND
is definitely useful, and unrelated to replacing FOREACH
by DO { UNWIND ... }
.
There is a general idea to allow OPTIONAL
as a prefix for all keywords that can have an effect on cardinality:
- (
OPTIONAL
)MATCH
- (
OPTIONAL
)UNWIND
- (
OPTIONAL
)CALL
As well as allowing OPTIONAL
around a whole subquery.
I like the generality of the OPTIONAL
as its own clause (or clause modifier). In terms of specification, I don't think we need to immediately define it like that, but we could let UNWIND
and OPTIONAL UNWIND
get specified similarly to this CIP and later on lift OPTIONAL
to that more general state.
MERGE
is another clause that affects cardinality, but it doesn't have the zero-effect, since it then does a CREATE
-like operation and still retains cardinality. (Just mentioning it for inclusion.)