openCypher icon indicating copy to clipboard operation
openCypher copied to clipboard

UNWIND CIP

Open Mats-SX opened this issue 7 years ago • 8 comments

CIP

Mats-SX avatar Apr 24 '17 10:04 Mats-SX

Extended CIP to also include OPTIONAL UNWIND from #234.

Note on TCK impact of this PR: All language features should be properly grouped into TCK features, with only relevant scenarios in each. Currently many TCK features contain a varied set of scenarios -- these are conventionally named *Acceptance.feature and have a historical background as Neo4j acceptance tests. Curated TCK features use another naming convention, and this PR provides an improvement in this area.

For OPTIONAL UNWIND I simply copied the TCK feature of UNWIND and modified the queries to use OPTIONAL UNWIND. This resulted in several scenarios which exemplify standard UNWIND semantics. Happy to discuss whether this approach is a good idea, or whether the OPTIONAL UNWIND scenarios only should consider cases using empty lists or null.

Mats-SX avatar May 22 '17 15:05 Mats-SX

Regarding OPTIONAL UNWIND, I support the addition.

We've had many users stumble across situations where they UNWIND an empty list and end up wiping out rows. Many still see UNWIND as a kind of iteration mechanism similar to FOREACH, and end up working with it in the same way (and in some situations we have no choice, since only write clauses are allowed in FOREACH) and are then surprised when their empty list ends up wiping out rows on an UNWIND.

I think providing an OPTIONAL approach to UNWIND can give our users more control over what should happen in these situations.

InverseFalcon avatar Sep 17 '19 18:09 InverseFalcon

I am wonder whether something like FOREACH (UNWIND ...) that was more explicit about the iteration behaviour would be helpful? The implicit nature of iteration is a blessing and a curse, perhaps?

alastai avatar Sep 17 '19 20:09 alastai

I don't think combining the two is the right answer. UNWIND isn't iteration, and it's not bounded the way FOREACH is, and I think that's okay.

And whether the user is approaching UNWIND with iteration in mind or not, the wiping out of rows when the list is empty can be surprising, and requires a CASE dance to use a [null] for the structure in order to get around it. I don't want to change existing behavior, as that is useful too, but having a quick and easy way to guard against wiping out the row on an empty list UNWIND (in cases where you need that fallback) would be helpful.

InverseFalcon avatar Sep 17 '19 20:09 InverseFalcon

@InverseFalcon, @alastai: The thinking of the Cypher designers at Neo4j has been to deprecate (and eventually remove) FOREACH in favour of subqueries that do not affect cardinality (tentative keyword DO) in combination with UNWIND. So that you would spell iteration in the following manner:

MATCH (x:Bar)
DO {
    UNWIND $someCollection AS item
    CREATE (y:Foo{name:item})
    CREATE (x)-[:LIKES]->(y)
}

thobe avatar Sep 18 '19 07:09 thobe

@thobe No complaints there, that looks good to me.

And that will cover many of the problem cases, since it looks like that wouldn't change cardinality.

I still think there will be cases where we couldn't use this approach, as the bounds of the DO {} block would mess up what we eventually need to perform or return later, so an OPTIONAL UNWIND may still be needed.

InverseFalcon avatar Sep 18 '19 15:09 InverseFalcon

@InverseFalcon Indeed, OPTIONAL UNWIND is definitely useful, and unrelated to replacing FOREACH by DO { UNWIND ... }.

There is a general idea to allow OPTIONAL as a prefix for all keywords that can have an effect on cardinality:

  • (OPTIONAL) MATCH
  • (OPTIONAL) UNWIND
  • (OPTIONAL) CALL

As well as allowing OPTIONAL around a whole subquery.

thobe avatar Sep 18 '19 15:09 thobe

I like the generality of the OPTIONAL as its own clause (or clause modifier). In terms of specification, I don't think we need to immediately define it like that, but we could let UNWIND and OPTIONAL UNWIND get specified similarly to this CIP and later on lift OPTIONAL to that more general state.

MERGE is another clause that affects cardinality, but it doesn't have the zero-effect, since it then does a CREATE-like operation and still retains cardinality. (Just mentioning it for inclusion.)

Mats-SX avatar Sep 19 '19 07:09 Mats-SX