age
age copied to clipboard
Several server crashes when running tests in parallel
Describe the bug Several server crashes when running tests in the way below.
How are you accessing AGE (Command line, driver, etc.)? Accessing AGE through command line.
What data setup do we need to do? Apache AGE (PG16 branch) with PostgreSQL (REL_16_STABLE).
What is the necessary configuration info needed? First build:
./configure CFLAGS=" -Og" --enable-tap-tests --enable-debug --enable-cassert
Second build:
./configure CFLAGS=" -Og" --enable-tap-tests --enable-debug
I was able to crash a server by doing the following: Makefile:
diff --git a/Makefile b/Makefile
index b405ff6..549e05b 100644
--- a/Makefile
+++ b/Makefile
@@ -85,31 +85,7 @@ SQLS := $(addsuffix .sql,$(SQLS))
DATA_built = $(age_sql)
# sorted in dependency order
-REGRESS = scan \
- graphid \
- agtype \
- catalog \
- cypher \
- expr \
- cypher_create \
- cypher_match \
- cypher_unwind \
- cypher_set \
- cypher_remove \
- cypher_delete \
- cypher_with \
- cypher_vle \
- cypher_union \
- cypher_call \
- cypher_merge \
- age_global_graph \
- age_load \
- index \
- analyze \
- graph_generation \
- name_validation \
- jsonb_operators \
- drop
+REGRESS=--schedule=schedule
srcdir=`pwd`
schedule:
test: cypher_match cypher_match cypher_match cypher_match cypher_match cypher_match
Start tests:
for i in `seq 100000`;do echo "ITER $i";make -s installcheck;if coredumpctl;then break;fi; done
Results:
ITER 75
# +++ regress install-check in +++
# using temp instance on port 61958 with PID 39227
# parallel group (6 tests): cypher_match cypher_match cypher_match cypher_match cypher_match cypher_match
not ok 1 + cypher_match 1081 ms
# (test process exited with exit code 2)
not ok 2 + cypher_match 1035 ms
# (test process exited with exit code 2)
not ok 3 + cypher_match 1061 ms
# (test process exited with exit code 2)
not ok 4 + cypher_match 1034 ms
# (test process exited with exit code 2)
not ok 5 + cypher_match 1036 ms
# (test process exited with exit code 2)
not ok 6 + cypher_match 1035 ms
# (test process exited with exit code 2)
1..6
# 6 of 6 tests failed.
# The differences that caused some tests to fail can be viewed in the file "/home/egor/work/subtree/age/regress/regression.diffs".
# A copy of the test summary that you see above is saved in the file "/home/egor/work/subtree/age/regress/regression.out".
There were many attempts to run tests in this way, so during the launch process I was able to get several different crashes, presented below: Backtrace of the crash on the build with --enable-cassert: bt1_cassert.txt Bracktrace of the first crash on the build without --enable-cassert: bt2.txt Bracktrace of the second crash on the build without --enable-cassert: bt3.txt
Expected behavior Expected ERROR to be shown or sql query to be succesfully executed
Best regards, Egor Chindyaskin Postgres Professional: http://postgrespro.com/
@saygoodbyye We do not support user modified Makefiles. The Makefile is only for installing AGE and is only intended to test what was just installed.
If you want to run those specific checks in parallel to highlight an issue, please write a separate script to do so.
It seems that the server crashes occur when running tests using a specific Makefile setup and iterating through the tests multiple times. Here's a summary of the issue and the steps leading to the server crashes:
Bug Description:
Several server crashes occur when running tests in a loop with the specified Makefile setup. Steps to Reproduce:
Modify the Makefile to run tests in a loop multiple times. Start tests using the modified Makefile. Iterate through the tests multiple times, possibly hundreds of iterations. Observe server crashes occurring during the test iterations. Expected Behavior:
Tests should execute successfully without causing server crashes even when run multiple times in a loop. Environment:
Apache AGE (PG16 branch) with PostgreSQL (REL_16_STABLE). Makefile Modification:
REGRESS=--schedule=schedule Modified Test Execution:
for i in seq 100000; do
echo "ITER $i"
make -s installcheck
if coredumpctl; then
break
fi
done
Actual Outcome:
Server crashes occur during the test iterations, leading to failed tests and potential instability. Workaround:
Since the issue seems related to running tests repeatedly in a loop, you might consider running the tests individually or in smaller batches to mitigate the likelihood of server crashes until the root cause of the crashes can be identified and resolved. Additionally, analyzing the server logs and core dumps generated during the crashes could provide insights into the underlying cause.
@jrgemignani, I was able to reproduce this bug in a different way using pgreplay utility. To reproduce bug follow steps:
*** configure and install pgreplay ***
*** then replay attached postmaster.log file ***
./pgreplay -j postmaster.log
This issue is stale because it has been open 60 days with no activity. Remove "Abondoned" label or comment or this will be closed in 14 days.
This issue is stale because it has been open 60 days with no activity. Remove "Abondoned" label or comment or this will be closed in 14 days.
This issue was closed because it has been stalled for further 14 days with no activity.