Flaky test age_global_graph fails on slow machines

Open saygoodbyye opened this issue 1 year ago • 3 comments

Describe the bug: The age_global_graph test sometimes fails. The failure mainly appears when the machine is running slowly.

How are you accessing AGE (Command line, driver, etc.)? Accessing AGE through command line.

What data setup do we need to do? Apache AGE (master branch) with PostgreSQL (REL_16_STABLE).

What is the necessary configuration info needed?

./configure CFLAGS=" -Og" --enable-tap-tests --enable-debug  --enable-cassert

What is the command that caused the error? To reproduce the failure much faster, I apply this Makefile patch. It only speeds up reproduction; the failure can occur without these changes:

diff --git a/Makefile b/Makefile
index 1224fc2a..0d51a9e0 100644
--- a/Makefile
+++ b/Makefile
@@ -85,35 +85,7 @@ SQLS := $(addsuffix .sql,$(SQLS))
 DATA_built = $(age_sql)
 
 # sorted in dependency order
-REGRESS = scan \
-          graphid \
-          agtype \
-          catalog \
-          cypher \
-          expr \
-          cypher_create \
-          cypher_match \
-          cypher_unwind \
-          cypher_set \
-          cypher_remove \
-          cypher_delete \
-          cypher_with \
-          cypher_vle \
-          cypher_union \
-          cypher_call \
-          cypher_merge \
-          cypher_subquery \
-          age_global_graph \
-          age_load \
-          index \
-          analyze \
-          graph_generation \
-          name_validation \
-          jsonb_operators \
-          list_comprehension \
-          map_projection \
-          drop
-
+REGRESS=$(shell printf "age_global_graph %.0s" `seq 100` )
 srcdir=`pwd`
 
 ag_regress_dir = $(srcdir)/regress
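
The replacement REGRESS line relies on a printf idiom: printf re-applies its format string once per argument, and %.0s consumes an argument while printing none of it, so the test name is emitted once per number produced by seq. A minimal sketch of that idiom in isolation, where repeat_name is a hypothetical helper and not part of the AGE Makefile:

```shell
# repeat_name is a hypothetical helper used only to illustrate the idiom.
# It assumes the name contains no '%' or backslash, since the name is
# embedded directly in the printf format string (as in the patch above).
repeat_name() {
    name=$1
    count=$2
    # printf reuses the format for every argument from seq; %.0s swallows
    # each number and prints zero characters of it, leaving "name " per pass.
    printf "${name} %.0s" $(seq "$count")
}

repeat_name age_global_graph 3
echo
```

With 100 in place of 3, this yields the same list of 100 identical test names that the patched Makefile hands to the regression runner.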

Then I run make installcheck in a loop until it fails:

for i in `seq 1000`;do make installcheck || break;done

And get:

<...>
ok 60        - age_global_graph                         1440 ms
not ok 61    - age_global_graph                         1126 ms
ok 62        - age_global_graph                         1181 ms
ok 63        - age_global_graph                         1037 ms
<...>

regression.diffs:

diff -U3 /home/test/work/age/regress/expected/age_global_graph.out /home/test/work/age/regress/results/age_global_graph.out
--- /home/test/work/age/regress/expected/age_global_graph.out   2024-05-14 13:18:45.498945171 +0700
+++ /home/test/work/age/regress/results/age_global_graph.out    2024-05-14 23:47:05.372383890 +0700
@@ -81,7 +81,7 @@
 SELECT * FROM cypher('ag_graph_2', $$ RETURN delete_global_graphs('ag_graph_2') $$) AS (result agtype);
  result
 --------
- true
+ false
 (1 row)

 -- delete ag_graph_1's context
@@ -97,7 +97,7 @@
 SELECT * FROM cypher('ag_graph_3', $$ RETURN delete_global_graphs('ag_graph_3') $$) AS (result agtype);
  result
 --------
- true
+ false
 (1 row)

 -- delete all graphs' context again

Best regards, Egor Chindyaskin Postgres Professional: https://postgrespro.com/

saygoodbyye avatar May 15 '24 07:05 saygoodbyye

@saygoodbyye These are not necessarily errors. The different return value mainly means the context was already deleted. It's like a delete erroring out because something else already deleted the file it was going to delete.
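
The file analogy can be sketched in shell. This is only an illustration of the "already deleted" behavior described in this comment; delete_if_present is a hypothetical helper and says nothing about how AGE implements delete_global_graphs:

```shell
# Hypothetical helper for illustration only; not AGE code.
delete_if_present() {
    if [ -e "$1" ]; then
        rm -f -- "$1"
        echo true     # the file existed and was removed
    else
        echo false    # nothing to remove: it was already gone
    fi
}

tmp=$(mktemp)
first=$(delete_if_present "$tmp")     # file exists, so it is removed
second=$(delete_if_present "$tmp")    # same path again, already gone
echo "$first $second"
```

The second call reports false even though nothing went wrong, which mirrors the true/false difference in regression.diffs: the outcome depends on whether something else removed the target first.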

jrgemignani avatar May 15 '24 15:05 jrgemignani

@saygoodbyye I have applied a fix to the master branch. It should resolve the issue #1881. Please try it out.

jrgemignani avatar May 17 '24 22:05 jrgemignani

@saygoodbyye All versions now have this update. Can you let us know if it fixes the issue?

jrgemignani avatar May 22 '24 20:05 jrgemignani

@jrgemignani, Thank you! The fix worked well! Closing this issue, but there is one more similar case: #1899.

saygoodbyye avatar May 25 '24 10:05 saygoodbyye

@jrgemignani, Hello. Following up on this thread, I would like to point out a problem in our test jobs on Postgres versions 14 and 15. The test still fails on these versions. Would it be difficult to backport the patch from master to the other branches?

saygoodbyye avatar Sep 12 '24 05:09 saygoodbyye

@saygoodbyye It is in PG 15 & 14. However, you would need to use the Docker dev_snapshot_<PG15/PG14> or build the latest branches to see it, as there isn't a current release with the fix.

jrgemignani avatar Sep 18 '24 18:09 jrgemignani