age icon indicating copy to clipboard operation
age copied to clipboard

Implement multiple label

Open rafsun42 opened this issue 1 year ago • 5 comments

Add support for multiple labels in CREATE, MERGE and MATCH clause.

rafsun42 avatar Dec 14 '23 21:12 rafsun42

hello :-)

i have tried this while looking for a way to use OR (|) operator in Cypher queries, as i mentionned here.

but i couldn't compile (probably because i misunderstand something). here is the output of make after i have gh pr checkout 1452 from inside AGE directory:

gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wcast-function-type -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-format-truncation -Wno-stringop-truncation -g -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -fno-omit-frame-pointer -fPIC -I.//src/include -I.//src/include/parser -I. -I./ -I/usr/include/postgresql/15/server -I/usr/include/postgresql/internal  -Wdate-time -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -I/usr/include/libxml2   -c -o src/backend/parser/cypher_label_expr.o src/backend/parser/cypher_label_expr.c
src/backend/parser/cypher_label_expr.c: In function ‘get_label_expr_relations’:
src/backend/parser/cypher_label_expr.c:263:9: error: a label can only be part of a statement and a declaration is not a statement
  263 |         List *reloids;
      |         ^~~~
src/backend/parser/cypher_label_expr.c:264:9: error: expected expression before ‘List’
  264 |         List *(*merge_lists)(List *, const List *);
      |         ^~~~
src/backend/parser/cypher_label_expr.c:265:9: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
  265 |         ListCell *lc;
      |         ^~~~~~~~
src/backend/parser/cypher_label_expr.c:281:9: error: ‘merge_lists’ undeclared (first use in this function)
  281 |         merge_lists = label_expr_type == LABEL_EXPR_TYPE_AND ?
      |         ^~~~~~~~~~~
src/backend/parser/cypher_label_expr.c:281:9: note: each undeclared identifier is reported only once for each function it appears in
src/backend/parser/cypher_label_expr.c:321:27: warning: implicit declaration of function ‘merge_lists’ [-Wimplicit-function-declaration]
  321 |                 reloids = merge_lists(reloids, lcd->allrelations);
      |                           ^~~~~~~~~~~
src/backend/parser/cypher_label_expr.c:321:25: warning: assignment to ‘List *’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
  321 |                 reloids = merge_lists(reloids, lcd->allrelations);
      |                         ^
src/backend/parser/cypher_label_expr.c: In function ‘label_expr_relname’:
src/backend/parser/cypher_label_expr.c:385:9: error: a label can only be part of a statement and a declaration is not a statement
  385 |         StringInfo relname_strinfo;
      |         ^~~~~~~~~~
src/backend/parser/cypher_label_expr.c:386:9: error: expected expression before ‘ListCell’
  386 |         ListCell *lc;
      |         ^~~~~~~~
src/backend/parser/cypher_label_expr.c:387:9: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
  387 |         char *relname;
      |         ^~~~
In file included from /usr/include/postgresql/15/server/access/tupdesc.h:19,
                 from /usr/include/postgresql/15/server/utils/relcache.h:17,
                 from /usr/include/postgresql/15/server/access/genam.h:21,
                 from src/backend/parser/cypher_label_expr.c:3:
src/backend/parser/cypher_label_expr.c:394:18: error: ‘lc’ undeclared (first use in this function)
  394 |         foreach (lc, label_expr->label_names)
      |                  ^~
/usr/include/postgresql/15/server/nodes/pg_list.h:356:5: note: in definition of macro ‘foreach’
  356 |    (cell = &cell##__state.l->elements[cell##__state.i], true) : \
      |     ^~~~
/usr/include/postgresql/15/server/nodes/pg_list.h:356:55: warning: left-hand operand of comma expression has no effect [-Wunused-value]
  356 |    (cell = &cell##__state.l->elements[cell##__state.i], true) : \
      |                                                       ^
src/backend/parser/cypher_label_expr.c:394:9: note: in expansion of macro ‘foreach’
  394 |         foreach (lc, label_expr->label_names)
      |         ^~~~~~~
/usr/include/postgresql/15/server/nodes/pg_list.h:357:16: warning: left-hand operand of comma expression has no effect [-Wunused-value]
  357 |    (cell = NULL, false); \
      |                ^
src/backend/parser/cypher_label_expr.c:394:9: note: in expansion of macro ‘foreach’
  394 |         foreach (lc, label_expr->label_names)
      |         ^~~~~~~
make: *** [<commande interne> : src/backend/parser/cypher_label_expr.o] Erreur 1

here is the output of psql --version:

psql (PostgreSQL) 15.5 (Debian 15.5-1.pgdg110+1)

i am running postgresql 15 on Debian 11 bullseye, installed with these debian apt packages:

postgresql-client-15 
postgresql-15 
postgresql-server-dev-15 
libpq-dev

i have tried on these PG15 branchs of AGE (for both, in META.JSON file, the version is said to be 1.3.0):

git clone --branch PG15 https://github.com/apache/age
git clone --branch PG15/v1.5.0-rc0 https://github.com/apache/age .

it's probably meaningless but to make AGE compile, i had to install two deps from debian apt packages: bison and flex.

thjbdvlt avatar Feb 08 '24 12:02 thjbdvlt

@thjbdvlt Did you do make clean before make install?

rafsun42 avatar Feb 08 '24 20:02 rafsun42

@rafsun42 i didn't, but with make clean, the same errors occur when i run make install, or also if i use make clean install. i would like to help more but i really dont understand all of these, (i'll read about make next days, and as i did want to learn some c, it may be a good occasion to start and try understanding these reports).

(plus, as i now have a small partition on my computer dedicated to postgresql 15/age/pr 1452, i can easily try things)

thjbdvlt avatar Feb 09 '24 08:02 thjbdvlt

@thjbdvlt After some digging, I figured out this branch indeed does not compile with a PostgreSQL copy compiled with the --with-llvm option, which is the case when PostgreSQL is installed via a package manager. This error did not occur in my development environment because I build PostgreSQL from source without that option.

I pushed a commit to address this issue. Let me know if it compiles now.

rafsun42 avatar Feb 13 '24 00:02 rafsun42

@rafsun42 it does compile! i ran some minimal tests, matching with (a)-[:elabel_b|edge_label_c]->(d) works. absolutely wonderful :)

thjbdvlt avatar Feb 14 '24 10:02 thjbdvlt

This Feature is huge, not only for openCypher compliance. A merge would be much appreciated. I would like to provide two real life examples, where this is mandatory for matching.

I am using graph databases mostly as backend for traditional OLTP Applications, as opposed to data analysis. For me the most common use case for union style relation type queries is, to enforce constraints on your relations by checking for forbidden combinations before creating an edge.

Example 1

Imagine a graph with vertices representing persons :PERSON and edges representing their relation to one another :SIBLING, :PARENT, :MARRIED, .... When trying to add a new edge between two persons, especially when triggered by user input, you would probably want to check, if that would violate the rules of your application. In this case you might consider denying the action, if (a:PERSON)-[:SIBLING|PARENT *]-(b:PERSON) exists, before marrying two people.

Example 2

Imagine a graph for task planning. Tasks :TASK can depend :DEPENDS_ON on other tasks, to form a chain of execution. But they can also be infinitely nested :SUBTASK_OF to divide them into smaller chunks of work. Before creating any of the two edge types, we need to check, that we don't create a circular dependency. In this graph model, it can happen like this:

// having
(a:TASK)-[:DEPENDS_ON]->(b:TASK)
// each of these would produce a circular dependency for the model:
CREATE (b)-[:DEPENDS_ON]->(a)
CREATE (a)-[:SUBTASK_OF]->(b)
CREATE (b)-[:SUBTASK_OF]->(a)

To add to the complexity, any combination of both edge types in any path length are a problem and can (within reasonable effort) only be detected like this:

// again not allowed to exist:
MATCH p = (a:TASK)-[:DEPENDS_ON|SUBTASK_OF *]-(b:TASK)

Almost feels like a prime example for the power of graph databases. And if you don't want to use this feature in an OLTP context, you sure want to be able to find these conditions in your dataset.

mark-win avatar Mar 16 '24 00:03 mark-win

@rafsun42 Could this be rebased with the master and applied to the master branch instead of PG15? I'm not saying not to do PG15, just get it on the master first. Thoughts?

jrgemignani avatar Apr 19 '24 22:04 jrgemignani

@rafsun42 Additionally, it states that there are conflicts?

image

jrgemignani avatar Apr 19 '24 22:04 jrgemignani

@jrgemignani Sure, John. I will rebase this to latest master.

rafsun42 avatar Apr 19 '24 22:04 rafsun42

After rebase, moved to a new PR #1785.

rafsun42 avatar Apr 24 '24 21:04 rafsun42