age icon indicating copy to clipboard operation
age copied to clipboard

Complex `MERGE` causes crash

Open DominicWuest opened this issue 1 year ago • 9 comments

I found a bug using my cypher fuzzer.

When running the following query against an empty database:

MERGE n0 = (:A)-[:B{n1:0}]->(x:C)-[:D]->(x)-[:E]->(x)<-[:F{n2:0}]-(:G{n3:0})-[:H{n4:0}]->(y:I{n5:0})<-[:J]-(y)<-[:K]-(:L{n6:0})

The postgres instance crashes and goes into recovery mode.

I encountered this issue when testing queries against the apache/age:PG13_latest docker image.

Steps to reproduce

Spin up a local instance of apache/age:PG13_latest: docker run -e POSTGRES_PASSWORD=123 --rm --name age apache/age:PG13_latest

Get a shell in the docker container: docker exec -it age /bin/bash

Connect to postgres: su postgres -c psql

Run the following queries:

LOAD 'age';
-
SET search_path = ag_catalog, "$user", public;
-
SELECT create_graph('graph');
-
SELECT * FROM cypher('graph',$$
    MERGE n0 = (:A)-[:B{n1:0}]->(x:C)-[:D]->(x)-[:E]->(x)<-[:F{n2:0}]-(:G{n3:0})-[:H{n4:0}]->(y:I{n5:0})<-[:J]-(y)<-[:K]-(:L{n6:0})
$$) as (v agtype);

Expected behavior

The query should run successfully

Actual behavior

The database crashes

DominicWuest avatar May 05 '23 09:05 DominicWuest

gdb stacktrace:

#0  0x00007f158d9c2d16 in epoll_wait (epfd=4, events=0x5576629cb6b0, 
    maxevents=1, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1  0x00005576618ce67c in WaitEventSetWait ()
#2  0x00005576617cae30 in secure_read ()
#3  0x00005576617d2947 in ?? ()
#4  0x00005576617d37e5 in pq_getbyte ()
#5  0x00005576618f1579 in PostgresMain ()
#6  0x00005576618794a5 in ?? ()
#7  0x000055766187a3f4 in PostmasterMain ()
#8  0x00005576615fc046 in main ()

DominicWuest avatar May 05 '23 14:05 DominicWuest

@DominicWuest This crash is resolved in the master and is due to variable reuse for labels. Currently, multiple labels are not supported. Additionally, we are still working on allowing variable reuse with labels.

psql-11.5-5432-pgsql=# SELECT * FROM cypher('graph2',$$ MERGE n0 = (:A)-[:B{n1:0}]->(x:C)-[:D]->(x)-[:E]->(x)<-[:F{n2:0}]-(:G{n3:0})-[:H{n4:0}]->(y:I{n5:0})<-[:J]-(y)<-[:K]-(:L{n6:0}) $$) as (v agtype); ERROR: multiple labels for variable 'x' are not supported LINE 2: MERGE n0 = (:A)-[:B{n1:0}]->(x:C)-[:D]->(x)-[:E]->(x)<-[... ^ psql-11.5-5432-pgsql=#

jrgemignani avatar May 05 '23 18:05 jrgemignani

@DominicWuest Actually, I take that back. I was able to get it to crash.

jrgemignani avatar May 05 '23 20:05 jrgemignani

@jrgemignani @DominicWuest I am currently on the latest commit and I cannot get this to crash

Edit: I am running it on an empty database, it just throws this error:

2023-05-24 15:39:27.651 PDT [331077] ERROR: multiple labels for variable 'x' are not supported at character 74 2023-05-24 15:39:27.651 PDT [331077] STATEMENT: SELECT * FROM cypher('empty', $$ MERGE n0 = (:A)-[:B{n1:0}]->(x:C)-[:D]->(x)-[:E]->(x)<-[:F{n2:0}]-(:G{n3:0})-[:H{n4:0}]->(y:I{n5:0})<-[:J]-(y)<-[:K]-(:L{n6:0}) $$) as (a agtype); ERROR: multiple labels for variable 'x' are not supported

dehowef avatar May 24 '23 22:05 dehowef

postgres=# SELECT * FROM cypher('test', $$
       MERGE n0 = (x)-[:A]->(x)-[:B]->(x)-[:C]->(x)
 $$) AS (res agtype);
WARNING:  problem in alloc set ExecutorState: bad single-chunk 0x560be8f472b0 in block 0x560be8f3e5c0
WARNING:  problem in alloc set ExecutorState: detected write past chunk end in block 0x560be8f3e5c0, chunk 0x560be8f472b0
WARNING:  problem in alloc set ExecutorState: found inconsistent memory block 0x560be8f3e5c0
WARNING:  problem in alloc set ExecutorState: bad single-chunk 0x560be8f472b0 in block 0x560be8f3e5c0
WARNING:  problem in alloc set ExecutorState: detected write past chunk end in block 0x560be8f3e5c0, chunk 0x560be8f472b0
WARNING:  problem in alloc set ExecutorState: found inconsistent memory block 0x560be8f3e5c0
 res 
-----
(0 rows)

mohayu22 avatar May 25 '23 12:05 mohayu22

postgres=# SELECT * FROM cypher('test', $$
             MERGE n0 = (:A)-[:B{n1:0}]->(x)-[:D]->(x)-[:E]->(x)<-[:F{n2:0}]-(:G{n3:0})
  $$) AS (res agtype);

server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

mohayu22 avatar May 25 '23 12:05 mohayu22

Here is a slightly less complicated command that exhibits both errors -

psql-11.5-5432-pgsql=# SELECT * FROM cypher('test', $$ MERGE ()-[:B]->(x:C)-[:E]->(x:C)<-[f:F]-(y:I) $$) as (r agtype);
WARNING:  problem in alloc set ExecutorState: detected write past chunk end in block 0x136b500, chunk 0x1378f48
WARNING:  problem in alloc set ExecutorState: bad single-chunk 0x1378fe0 in block 0x136b500
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>

It appears that this MERGE bug has to do with memory allocation. It seems likely that the successive tuple creates aren't properly allocating tuples.

jrgemignani avatar May 31 '23 19:05 jrgemignani

I have created PR #957 to address some of these crashes. It won't address all of them, but it will help to isolate the ones that remain. Basically, there are multiple paths for some of these complex MERGEs to cause a crash, this PR addresses some of those paths.

jrgemignani avatar Jun 01 '23 00:06 jrgemignani

PR #961 addresses the source of -> this <- crash. So, this should now be corrected in the master branch.

jrgemignani avatar Jun 05 '23 23:06 jrgemignani