openCypher
Added the nested, updating, and chained subqueries CIP
This is part of the redesign of Cypher to support working with multiple graphs, targeting Cypher 10.
Rendered:
https://github.com/petraselmer/openCypher/blob/CIP-nested-subqueries/cip/1.accepted/CIP2016-06-22-nested-updating-and-chained-subqueries.adoc
I'm not informed enough to offer meaningful feedback here, but as a user of Neo4j, this looks great! Thank you for putting this together. =)
The only thing that made me go "hmm, interesting" was that subqueries' variables (variable bindings) persist into the outer query. How does that work if you have multiple independent subqueries that happen to (coincidentally) share a variable name? (E.g. something generic like user or tweet).
It would feel a bit more intuitive to me if only the RETURN variables persisted. Then, you could naturally rename those with WITH before running another subquery, etc.
But again, I'm not informed enough to know if this feedback is reasonable or flawed.
@aseemk Based on the examples, I'm guessing that it is only the RETURN values that are made available to the outer query. It's just that the language isn't very clear when it says "...Any new variable bindings produced by evaluating the subquery will augment the variable bindings of the initial record...".
Even more so, the intent is to not allow any shadowing, i.e. you cannot rebind a variable from the outer query in the RETURN of the nested subquery. This is in line with the CALL ... YIELD ... and UNWIND ... AS ... semantics, which do not allow that either. I think it really helps reduce confusion and hopefully nudges users towards picking more meaningful (distinguishing) names.
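To make the binding rules discussed above concrete, here is a rough Python model of them. It is only a sketch of how I read the comments, not the CIP's specification: `apply_subquery`, `Record`, and the sample data are hypothetical names invented for illustration. It assumes that only the subquery's RETURN values reach the outer query, and that a returned name clashing with an outer variable is an error.

```python
from typing import Callable, Dict, Iterable, Iterator

Record = Dict[str, object]
Subquery = Callable[[Record], Iterable[Record]]


def apply_subquery(outer: Iterable[Record], subquery: Subquery) -> Iterator[Record]:
    """Evaluate `subquery` once per outer record and merge in what it returns."""
    for record in outer:
        for returned in subquery(record):
            # Assumed rule: RETURN may not rebind a variable of the outer query.
            clashes = set(record) & set(returned)
            if clashes:
                raise ValueError(f"subquery RETURN would shadow: {sorted(clashes)}")
            # Only the returned bindings augment the outer record.
            yield {**record, **returned}


def tweets_of(record: Record) -> Iterable[Record]:
    """Hypothetical subquery: one returned record per tweet of the outer user."""
    tweets = {"alice": ["t1", "t2"], "bob": []}
    return [{"tweet": t} for t in tweets[str(record["user"])]]


rows = list(apply_subquery([{"user": "alice"}, {"user": "bob"}], tweets_of))
```

In this model two independent subqueries that both use an internal variable called `user` do not interfere, because their internal names never leave the subquery. Only a clash on a returned name is rejected.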
+1 (though I suspect the same can be done in SPARQL, so I have some doubts about the SPARQL comparison in your proposal ;))
I like the simplicity of this proposal, but I wonder how well it scales to other things we would/might want to do with/as subqueries in the future.
I was actually hoping we would be able to replace union with subqueries, rather than just allowing post-union processing by using union in a subquery, because I find union-queries hard to read with the constituent parts not being strongly delineated.
I've also held some hope that we would be able to use sub-queries as parameters to procedures, and there might be other use cases we would want to use sub-queries for that I can't think of right now.
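As a rough illustration of "post-union processing by using union in a subquery", here is a small Python model. It is a sketch under assumptions, not anything taken from the CIP: `union_distinct` and the sample records are hypothetical. The two branches are combined first, and only then does the outer part of the query aggregate over the combined result.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, List

Record = Dict[str, Hashable]


def union_distinct(*branches: Iterable[Record]) -> List[Record]:
    """Combine branch results, dropping duplicate records (UNION, not UNION ALL)."""
    seen = set()
    result: List[Record] = []
    for branch in branches:
        for record in branch:
            key = tuple(sorted(record.items()))
            if key not in seen:
                seen.add(key)
                result.append(record)
    return result


# Hypothetical results of the two branches of a union inside a subquery.
liked = [{"name": "Acme", "code": "L1"}, {"name": "Bolt", "code": "L2"}]
owned = [{"name": "Acme", "code": "L1"}, {"name": "Acme", "code": "L3"}]

# The "subquery" yields the deduplicated union ...
combined = union_distinct(liked, owned)
# ... and the outer query post-processes it, here by counting products per brand.
per_brand = Counter(record["name"] for record in combined)
```

The point of the model is only the ordering: the aggregation sees one combined stream of records instead of having to be repeated in each branch.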
I have added the 'NOT READY FOR REVIEW' label only because we still need to consider path bindings for subqueries. We'll add this in a fortnight or so, when @boggle and I are back from leave.
In the meantime, much content regarding the new DO syntax has been added, so comments welcome.
Hi @zazi
Thanks for your comment. I was wondering if you could provide an example (or link) in which correlated subqueries are supported in SPARQL? According to the W3C link in the CIP, it appears to be the case that 'vanilla' SPARQL cannot support correlated subqueries, but it may very well be the case that a vendor-specific implementation does. If this is the case, I would love to know, and, of course, this shall be added to the CIP.
@petraselmer I'm not an expert in this field. Nevertheless, as far as I understand it, I agree with the answer to this question: http://answers.semanticweb.com/questions/24508/does-sparql-11-support-correlated-subqueries ;)
For readability, I would recommend also stating each query in plain, natural-language sentences.
Furthermore, some example data that illustrate the queries might be helpful (i.e. a small data set and the result sets of the different queries).
@zazi
"Furthermore, some example data that illustrate the queries might be helpful (i.e. a small data set and the result sets of the different queries)."
This would be solved by the addition of TCK scenarios for this feature (which is planned). Do you agree?
@thobe passing subqueries as arguments to procedures is out of scope for this CIP.
@thobe How do you envision replacing UNION with subqueries? I'm not seeing that. Besides, UNION is very familiar to people with a SQL background, and having it opens up a natural trajectory for adding other set operations (INTERSECT, SET DIFFERENCE).
+1, I'm waiting for this
In which Neo4j release is/will this change be included?
@ric81 That is currently not known. Do note that this repository manages the specification and standardisation of Cypher as a language, which is separate from Neo4j the graph database (although this decoupling is fairly new and not yet fully mature).
This CIP doesn't contain any formal syntax description in EBNF format.