age icon indicating copy to clipboard operation
age copied to clipboard

OPTIONAL MATCH should act as outer relationship, acts as cross join

Open MironAtHome opened this issue 1 year ago • 4 comments

Describe the bug Statement below does not return rows, should return LOAD 'age'; SET search_path TO ag_catalog; SELECT create_graph('test_graph'); -- SELECT drop_label('test_graph', 'vertice1'); -- SELECT drop_label('test_graph', 'vertice2'); -- SELECT drop_label('test_graph', 'edge1'); SELECT create_vlabel('test_graph', 'vertice1'); SELECT create_vlabel('test_graph', 'vertice2'); SELECT create_elabel('test_graph', 'edge1'); SELECT * FROM cypher( 'test_graph' , $$ MATCH (a:vertice1) RETURN a $$ ) AS (a agtype);

SELECT * FROM cypher( 'test_graph' , $$ MATCH (a:vertice2) RETURN a $$ ) AS (a agtype);

SELECT * FROM cypher( 'test_graph' , $$ CREATE (a:vertice1 {vertice_name:'first'}) RETURN a $$ ) AS (a agtype);

SELECT * FROM cypher( 'test_graph' , $$ OPTIONAL MATCH (a:vertice1), (b:vertice2) RETURN a, b $$ ) AS (a agtype, b agtype);

How are you accessing AGE (Command line, driver, etc.)?

  • PGAdmin4

What data setup do we need to do?


**What is the necessary configuration info needed?**
- Postgres version 14 Age release version 1.5.0

**What is the command that caused the error?**
```pgsql
SELECT * FROM cypher(
	'test_graph'
	, $$
	OPTIONAL MATCH (a:vertice1), (b:vertice2)
	RETURN a, b
$$
) AS (a agtype, b agtype);
ERROR:  no rows produced

Expected behavior value produced for vertice1, NULL produced for vertice2

Environment (please complete the following information):

  • Version: 1.5.0

Additional context Clarification using brief search on the web: image

Please note, I do not consider this source of information to be authoritative, but I have a hunch that in this case it's close enough to main to call it out. Feel free to find clarification in anything that looks more like a standard that would render me wrong.

MironAtHome avatar Feb 27 '24 21:02 MironAtHome

Reproduced on both, Windows and Linux. Workaround, produces left outer join:

SELECT * FROM cypher(
'test_graph'
, $$
MATCH (a:vertice1)
OPTIONAL MATCH (b:vertice2)
RETURN a, b
$$
) AS (a agtype, b agtype);

MironAtHome avatar Feb 29 '24 07:02 MironAtHome

@MironAtHome I think your misunderstanding with OPTIONAL MATCH is with how MATCH works with patterns. OPTIONAL MATCH and MATCH process patterns in the same way. Using your above graph of one vertex, if you execute the following command, you will get no matches -

psql-16.1-5432-pgsql=# SELECT * FROM cypher('test_graph', $$ MATCH (a:vertice1), (b:vertice2) RETURN a, b $$) AS (a agtype, b agtype);
 a | b
---+---
(0 rows)

psql-16.1-5432-pgsql=#

Note that MATCH returned nothing with the pattern (a:vertice1), (b:vertice2). This is expected because both patterns do not exist, only one does. That is why OPTIONAL MATCH returned nothing.

jrgemignani avatar Feb 29 '24 20:02 jrgemignani

@MironAtHome If this resolves your question, please consider closing the ticket :)

jrgemignani avatar Feb 29 '24 20:02 jrgemignani

I think it is not really question of match, but a question of what "optional" means. Based on what I have gathered from the documentation ( provided in the comments ) optional indicates outer kind of relationship. Which means that in case of an empty set(s) on one of the sides, it is treated as an null, whilst the other one, depending on the operator, suppose to either furnish an matching row, or no row. So, if one is to read it with an eye to standard sql, this falls into the case of full outer join. It would be very nice if there is a chance to clarify similar against some alternate engine? I didn't treat it with high severity, since performing hierarchical match, where 1st level is required and 2nd level is optional, query works correctly. So, we can do left outer join right now, and that gives us a chance with, perhaps one additional query step, satisfy need to perform outer join. However, if query could pick up proper outer join behavior, it would be helpful, particularly for some of the corner cases used to test data quality. Hope this clarification helps.

MironAtHome avatar Mar 04 '24 16:03 MironAtHome

Used Neo4J sandbox to verify my report above. Looks like Neo4J behaves identically to Age. Closing this bug / question.

MironAtHome avatar Mar 23 '24 11:03 MironAtHome