neo4j-core icon indicating copy to clipboard operation
neo4j-core copied to clipboard

Unexpected behavior: #union_cypher returns String object rather than Query object

Open jorroll opened this issue 7 years ago • 9 comments

While attempting to create a query involving the union of three sub-queries, I discovered that union_cypher expects a query object but returns a string rather than another query object. This prevents me from performing a union query on a union query. Given that all the other query methods return a query object, is there a reason for this behavior? I guess it kinda explains the name union_cypher

https://github.com/neo4jrb/neo4j-core/blob/00190f40810b8fbb559ede1592273501d1230349/lib/neo4j-core/query.rb#L376-L386

How would you feel about a PR to either create a new query method union that returns another query object for further chaining?

Alternatively (and easier to implement), union_cypher could be updated so that it tests to see if its argument is already a string and, if so, simply inserts it without first calling to_cypher on the argument.

Thoughts?

jorroll avatar Dec 15 '17 01:12 jorroll

I'd be fine with a union method that returns a Query object, though I have a lot of questions (which is part of why I didn't make it in the first place):

  • When you chain, do the new clauses go into the last query in the UNION or onto all queries in the UNION?
  • Would the method validate that the name number of columns exist in both queries (and maybe that they are the same names, I don't know what Neo4j's UNION requires)?

Generally, I think the main reason I didn't implement a UNION is because the underlying structure of Query doesn't feel like it jives with a Query representing a UNION

I think I'd be fine with union_cypher taking a string, but what about the idea of union_cypher taking multiple Query arguments (perhaps via a def union_cypher(*others, options = {}) definition)?

cheerfulstoic avatar Dec 15 '17 13:12 cheerfulstoic

what about the idea of union_cypher taking multiple Query arguments (perhaps via a def union_cypher(*others, options = {}) definition)?

I like that 👍 .

Regarding a union query method, I see what you mean about confusing to implement. My main annoyance with union_cypher is that I need to put it into Neo4j::ActiveNode.query(). I can't simply call .exec or pluck() on it.

Maybe it would be best to leave a union() method out, but the implementation that comes to mind would be:

  1. Calling .union() on a query always adds a new union clause to the query. Once added, that union clause cannot be modified. Calling query_as(:n).match(n: {color: 'purple'}).union(query_as(:n).match(n: {color: 'blue'})) would be the same as query_as(:n).union(query_as(:n).match(n: {color: 'blue'})).match(n: {color: 'purple'}). I.e. additional query clause methods modify the original query, ignoring any union clauses.

  2. It looks like Neo4j expects all union returns to have the same columns (both name and number). So it would make sense to validate that and also: if you plucked the root query directly (.pluck()) to simply overwrite the return of the root and all unions so that they returned whatever was in the pluck.

Does this make sense? Thoughts?

jorroll avatar Dec 21 '17 00:12 jorroll

Yeah, I would be fine with what you’re proposing, though I would warn that the clause logic can get a bit complicated if you were thinking about implementing this.

Another thought: union_pluck and union_return which do the same as union_cypher but pluck / return for you.

cheerfulstoic avatar Dec 21 '17 14:12 cheerfulstoic

I see what you mean about the logic being confusing. I've got an implementation of .union and .union_all() query clause methods working in #309. Tests still need to be written and the code cleaned up, but I wanted to run it by you first.

I also tried to implement an improved union_cypher, before realizing that def union_cypher(*others, options = {}) wouldn't work, as options will never get a value. Not 100% sure if ruby would even let me put a splat before a regular argument like that. I decided not to pursue it further since I think .union & .union_all will provide better replacements.

jorroll avatar Dec 28 '17 06:12 jorroll

PS: I've seen you mention a few times how you have mixed feelings about neo4j-core's automatic query clause ordering (occasionally necessitating the use of.break). I'll admit, upon first usage I was surprised by the feature and it took me a bit to figure out what was going on. This being said, once I figured out what was going on, I found the api to be easy to work with, and far more powerful than an api without query ordering. I've definitely made use of the feature in a few places, and it would be very tedious / annoying to replicate on my own (but it's very easy to work around with .break, on the occasions I don't want it). My two cents.

jorroll avatar Dec 29 '17 06:12 jorroll

@thefliik That's fair, though I think there might be room for a hybrid implementation. Like certain clauses should combine, but not others. Like if you have multiple where method calls in order those should probably combine. But if there's a match in between, they shouldn't because WHERE is actually a modifier of MATCH (most people don't realize this and it causes strange issues because if you try do do MATCH ... WHERE ... MATCH ... WHERE ... and the second WHERE refers to a variable which was created in the first MATCH, you won't actually get filtering.

cheerfulstoic avatar Jan 05 '18 21:01 cheerfulstoic

Something that I just thought of: It's all fine to be able to understand a tool once you've figured it out, but in order to increase adoption I think it's important to make sure that things are as easy as possible (without compromising the design, of course)

cheerfulstoic avatar Jan 06 '18 21:01 cheerfulstoic

That's a great point. So as long as we're talking about "wouldn't it be great if." It'd probably be best if match clauses, by default, never re-arranged themselves, but there was an option to turn on re-arranging of some kind (something that more advanced users could discover).

jorroll avatar Jan 06 '18 22:01 jorroll

Yeah, probably MATCH clauses shouldn't rearrange themselves. There's a trickier question, perhaps, in if sequential match calls should be combined into one MATCH clause, since there is actually a difference between specifying two MATCH clauses and putting the same clauses into a single MATCH and separating them with commas. See this SO question and it's answers for details:

https://stackoverflow.com/questions/34089692/what-does-a-comma-in-a-cypher-query-do

cheerfulstoic avatar Jan 10 '18 02:01 cheerfulstoic