
auto_explain: fix failure when executor hooks are run in QEs

Open red1452 opened this issue 2 years ago • 4 comments

The failure occurs because the auto_explain hooks are fully executed on QEs: the global variable Gp_role is initialized only after auto_explain has been loaded as a shared preload library. A QE has valid planstate objects only for the plan nodes that belong to its own slice of the query plan. As a result, for "intermediate" slices the subtree state of a "child" Motion node (one at the slice border) is a NULL pointer; the plan subtree itself is valid and points to a real node, but its state is not. The ExplainNode() routine, called from ExplainPrintPlan(), fails when it tries to dereference that pointer.

To fix the failure I've replaced the Gp_role != GP_ROLE_DISPATCH guard in auto_explain's _PG_init() with the IS_QUERY_DISPATCHER() macro, which relies on the GpIdentity.segindex value to determine what kind of node the code is running on. As a consequence, the auto_explain hooks are installed only on the coordinator. To keep the hooks from running on the coordinator inside entrydb workers (which have Gp_role = GP_ROLE_EXECUTE), I've also added a Gp_role != GP_ROLE_DISPATCH check to the auto_explain_enabled() macro; it disables auto_explain whenever the coordinator is doing work as an executor.
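A minimal illustrative sketch of the two changes, assuming the usual contrib/auto_explain layout (the actual surrounding code may differ; the first two conditions of the macro mirror the upstream PostgreSQL version):

/* Sketch only: the _PG_init() guard keyed on node identity
 * (GpIdentity.segindex via IS_QUERY_DISPATCHER()) instead of Gp_role,
 * which is not yet set when shared preload libraries are loaded. */
void
_PG_init(void)
{
	if (!IS_QUERY_DISPATCHER())
		return;

	/* ... DefineCustomBoolVariable() calls and hook installation ... */
}

/* Sketch only: auto_explain_enabled() with the extra Gp_role condition, so
 * an entrydb backend (Gp_role == GP_ROLE_EXECUTE on the coordinator) keeps
 * auto_explain disabled. */
#define auto_explain_enabled() \
	(auto_explain_log_min_duration >= 0 && \
	 (nesting_level == 0 || auto_explain_log_nested_statements) && \
	 Gp_role == GP_ROLE_DISPATCH)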

Here are some reminders before you submit the pull request

  • [ ] Add tests for the change
  • [ ] Document changes
  • [ ] Communicate in the mailing list if needed
  • [ ] Pass make installcheck
  • [ ] Review a PR in return to support the community

red1452 avatar Jul 25 '22 03:07 red1452

@red1452 Please sign the Contributor License Agreement!

Click here to manually synchronize the status of this Pull Request.

See the FAQ for frequently asked questions.

pivotal-cla avatar Jul 25 '22 03:07 pivotal-cla

Problem description

The following scenario produces a failed assertion:

  1. Enable auto_explain in shared_preload_libraries:
gpconfig -c shared_preload_libraries -v 'auto_explain';
gpconfig -c auto_explain.log_min_duration -v 0 --skipvalidation;
gpconfig -c auto_explain.log_analize -v true --skipvalidation;
gpstop -raiq;
  2. Create the tables with the following queries and run a select:
CREATE TABLE t1(i int);
CREATE TABLE t2(i int);
SELECT count(1) from (select i from t1 limit 10) t join t2 using (i);
  3. The assertion fails:
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
connection to server was lost

Log from postmaster:

2022-07-25 08:47:28.818686 +07,"evgeniy","regression",p166797,th-1405949568,"[local]",,2022-07-25 08:47:28 +07,0,con8,cmd5,seg-1,slice2,,,sx1,"FATAL","XX000","Unexpected internal error (assert.c:44)","FailedAssertion(""!(Gp_role != GP_ROLE_EXECUTE)"", File: ""explain_gp.c"", Line: 1365)",,,,,"SELECT count(1) from (select i from t1 limit 10) t join t2 using (i);",0,,"assert.c",44,"Stack trace:
1    0x55aa250f7324 postgres errstart + 0x294
2    0x55aa250ef1aa postgres ExceptionalCondition + 0x3a
3    0x55aa24c65c15 postgres <symbol not found> + 0x24c65c15
4    0x7f11af6d9ac3 auto_explain.so <symbol not found> + 0xaf6d9ac3
5    0x55aa24f57aac postgres PortalStart + 0x45c
6    0x55aa24f501ed postgres <symbol not found> + 0x24f501ed
7    0x55aa24f54c12 postgres PostgresMain + 0x2362
8    0x55aa24e9bb91 postgres <symbol not found> + 0x24e9bb91
9    0x55aa24e9ca9c postgres PostmasterMain + 0xe1c
10   0x55aa24a4e660 postgres main + 0x510
11   0x7f11aec42fd0 libc.so.6 <symbol not found> + 0xaec42fd0
12   0x7f11aec4307d libc.so.6 __libc_start_main + 0x7d
13   0x55aa24a5dae5 postgres _start + 0x25

The failure occurs because auto_explain runs on the coordinator while its Gp_role is GP_ROLE_EXECUTE (an entrydb backend); the assertion in explain_gp.c requires that Gp_role is not GP_ROLE_EXECUTE. This patch also fixes another failure: the auto_explain hooks are fully executed on QEs. The global variable Gp_role is initialized only after auto_explain has been loaded as a shared preload library. A QE has valid planstate objects only for the plan nodes that belong to its own slice of the query plan. As a result, for "intermediate" slices the subtree state of a "child" Motion node (one at the slice border) is a NULL pointer; the plan subtree itself is valid and points to a real node, but its state is not. The ExplainNode() routine, called from ExplainPrintPlan(), fails when it tries to dereference that pointer.
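To make the crash mechanism concrete, here is a simplified, hypothetical sketch of the kind of recursive walk ExplainNode() performs via ExplainPrintPlan() (not the actual explain.c code): on a QE, a Motion node at the slice border has a valid child Plan, but the corresponding child PlanState is NULL, so recursing into it dereferences a NULL pointer.

#include "postgres.h"
#include "nodes/execnodes.h"
#include "nodes/plannodes.h"

/* Hypothetical sketch only, not the real explain.c code. */
static void
walk_plan_state(PlanState *planstate, int depth)
{
	Plan	   *plan = planstate->plan;	/* crashes when planstate is NULL */

	elog(LOG, "%*splan node tag %d", depth * 2, "", (int) nodeTag(plan));

	/*
	 * On a QE, a Motion node at the border of the current slice has a valid
	 * plan->lefttree, but planstate->lefttree (the subtree state) is NULL,
	 * so the recursive call below is made with a NULL pointer.
	 */
	if (outerPlan(plan))
		walk_plan_state(outerPlanState(planstate), depth + 1);
	if (innerPlan(plan))
		walk_plan_state(innerPlanState(planstate), depth + 1);
}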

Patch details

I've replaced the Gp_role != GP_ROLE_DISPATCH guard in auto_explain's _PG_init() with the IS_QUERY_DISPATCHER() macro, which relies on the GpIdentity.segindex value to determine what kind of node the code is running on. As a consequence, the auto_explain hooks are installed only on the coordinator. To keep the hooks from running on the coordinator inside entrydb workers (which have Gp_role = GP_ROLE_EXECUTE), I've also added a Gp_role != GP_ROLE_DISPATCH check to the auto_explain_enabled() macro; it disables auto_explain whenever the coordinator is doing work as an executor.

red1452 avatar Jul 25 '22 03:07 red1452

@red1452 Thank you for signing the Contributor License Agreement!

pivotal-cla avatar Jul 25 '22 04:07 pivotal-cla

Hi, thanks for your contribution @red1452. The CI pipeline fails with the following regression diffs:

Planner:

======================================================================
DIFF FILE: ../gpdb_src/contrib/auto_explain/regression.diffs
----------------------------------------------------------------------

diff -I HINT: -I CONTEXT: -I GP_IGNORE: -U3 /tmp/build/d62a0504/gpdb_src/contrib/auto_explain/expected/bfv_preload_auto_explain.out /tmp/build/d62a0504/gpdb_src/contrib/auto_explain/results/bfv_preload_auto_explain.out
--- /tmp/build/d62a0504/gpdb_src/contrib/auto_explain/expected/bfv_preload_auto_explain.out	2022-07-25 05:11:51.469228984 +0000
+++ /tmp/build/d62a0504/gpdb_src/contrib/auto_explain/results/bfv_preload_auto_explain.out	2022-07-25 05:11:51.469228984 +0000
@@ -31,26 +31,20 @@
 LOG:  statement: SELECT count(1) from (select i from t1 limit 10) t join t2 using (i);
 LOG:  duration: ms
 Query Text: SELECT count(1) from (select i from t1 limit 10) t join t2 using (i);
-Finalize Aggregate  
-  ->  Gather Motion 3:1  (slice1; segments: 3)  
-        ->  Partial Aggregate  
-              ->  Hash Join  
+Finalize Aggregate  (cost=475.67..475.68 rows=1 width=8)
+  ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=475.61..475.66 rows=3 width=8)
+        ->  Partial Aggregate  (cost=475.61..475.62 rows=1 width=8)
+              ->  Hash Join  (cost=35.95..474.81 rows=321 width=0)
                     Hash Cond: (t2.i = t1.i)
-                    ->  Seq Scan on t2  
-                    ->  Hash  
-                          Buckets: 524288  Batches: 1  Memory Usage: 4096kB
-                          ->  Redistribute Motion 1:3  (slice2; segments: 1)  
+                    ->  Seq Scan on t2  (cost=0.00..355.00 rows=32100 width=4)
+                    ->  Hash  (cost=35.90..35.90 rows=3 width=4)
+                          ->  Redistribute Motion 1:3  (slice2; segments: 1)  (cost=35.50..35.90 rows=3 width=4)
                                 Hash Key: t1.i
-                                ->  Limit  
-                                      ->  Gather Motion 3:1  (slice3; segments: 3)  
-                                            ->  Limit  
-                                                  ->  Seq Scan on t1  
+                                ->  Limit  (cost=35.50..35.67 rows=10 width=4)
+                                      ->  Gather Motion 3:1  (slice3; segments: 3)  (cost=35.50..36.01 rows=30 width=4)
+                                            ->  Limit  (cost=35.50..35.61 rows=10 width=4)
+                                                  ->  Seq Scan on t1  (cost=0.00..355.00 rows=32100 width=4)
 Optimizer: Postgres query optimizer
-  (slice0)    Executor Memory: 
-  (slice1)    Executor Memory: 
-  (slice2)    Executor Memory: 
-  (slice3)    Executor Memory: 
-Memory used:
  count 
 -------
      0


ORCA:

======================================================================
DIFF FILE: ../gpdb_src/contrib/auto_explain/regression.diffs
----------------------------------------------------------------------

diff -I HINT: -I CONTEXT: -I GP_IGNORE: -U3 /tmp/build/b235ddbb/gpdb_src/contrib/auto_explain/expected/bfv_preload_auto_explain_optimizer.out /tmp/build/b235ddbb/gpdb_src/contrib/auto_explain/results/bfv_preload_auto_explain.out
--- /tmp/build/b235ddbb/gpdb_src/contrib/auto_explain/expected/bfv_preload_auto_explain_optimizer.out	2022-07-25 05:22:20.674095642 +0000
+++ /tmp/build/b235ddbb/gpdb_src/contrib/auto_explain/results/bfv_preload_auto_explain.out	2022-07-25 05:22:20.674095642 +0000
@@ -32,24 +32,18 @@
 LOG:  statement: SELECT count(1) from (select i from t1 limit 10) t join t2 using (i);  (entry db)
 LOG:  duration: ms
 Query Text: SELECT count(1) from (select i from t1 limit 10) t join t2 using (i);
-Aggregate  
-  ->  Gather Motion 3:1  (slice1; segments: 3)  
-        ->  Hash Join  
+Aggregate  (cost=0.00..862.00 rows=1 width=8)
+  ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..862.00 rows=1 width=1)
+        ->  Hash Join  (cost=0.00..862.00 rows=1 width=1)
               Hash Cond: (t1.i = t2.i)
-              ->  Redistribute Motion 1:3  (slice2)  
+              ->  Redistribute Motion 1:3  (slice2)  (cost=0.00..431.00 rows=1 width=4)
                     Hash Key: t1.i
-                    ->  Limit  
-                          ->  Gather Motion 3:1  (slice3; segments: 3)  
-                                ->  Seq Scan on t1  
-              ->  Hash  
-                    Buckets: 524288  Batches: 1  Memory Usage: 4096kB
-                    ->  Seq Scan on t2  
+                    ->  Limit  (cost=0.00..431.00 rows=1 width=4)
+                          ->  Gather Motion 3:1  (slice3; segments: 3)  (cost=0.00..431.00 rows=1 width=4)
+                                ->  Seq Scan on t1  (cost=0.00..431.00 rows=1 width=4)
+              ->  Hash  (cost=431.00..431.00 rows=1 width=4)
+                    ->  Seq Scan on t2  (cost=0.00..431.00 rows=1 width=4)
 Optimizer: Pivotal Optimizer (GPORCA)
-  (slice0)    Executor Memory: 
-  (slice1)    Executor Memory: 
-  (slice2)    Executor Memory: 
-  (slice3)    Executor Memory: 
-Memory used:
  count 
 -------
      0

kainwen avatar Jul 25 '22 07:07 kainwen

Looks reasonable at first quick glance.

Will look into it this week. Thanks a lot for your contribution!

kainwen avatar Aug 16 '22 14:08 kainwen

Hi @red1452, your analysis looks reasonable, but I cannot reproduce it by following your steps:

gpadmin@zlyu:~/workspace/gpdb/contrib/auto_explain$ make -j -s install
gpadmin@zlyu:~/workspace/gpdb/contrib/auto_explain$ gpconfig -c shared_preload_libraries -v 'auto_explain';
20220818:09:28:50:158378 gpconfig:zlyu:gpadmin-[INFO]:-completed successfully with parameters '-c shared_preload_libraries -v auto_explain'
gpadmin@zlyu:~/workspace/gpdb/contrib/auto_explain$ gpconfig -c auto_explain.log_min_duration -v 0 --skipvalidation;
20220818:09:28:51:158810 gpconfig:zlyu:gpadmin-[INFO]:-completed successfully with parameters '-c auto_explain.log_min_duration -v 0 --skipvalidation'
gpadmin@zlyu:~/workspace/gpdb/contrib/auto_explain$ gpconfig -c auto_explain.log_analize -v true --skipvalidation;
20220818:09:28:52:159227 gpconfig:zlyu:gpadmin-[INFO]:-completed successfully with parameters '-c auto_explain.log_analize -v true --skipvalidation'
gpadmin@zlyu:~/workspace/gpdb/contrib/auto_explain$ gpstop -raiq;
gpadmin@zlyu:~/workspace/gpdb/contrib/auto_explain$ psql
psql (12beta2)
Type "help" for help.

gpadmin=# select version();
                                                                                                               version

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------
 PostgreSQL 12beta2 (Greenplum Database 7.0.0-alpha.0+dev.15654.gf083d666ca6 build dev) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, 64-bit compiled
 on Aug 18 2022 09:25:46 (with assert checking)
(1 row)

gpadmin=# CREATE TABLE t1(i int);
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'i' as the Greenplum Database data distribution key for this table.
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
gpadmin=# CREATE TABLE t2(i int);
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'i' as the Greenplum Database data distribution key for this table.
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
gpadmin=# SELECT count(1) from (select i from t1 limit 10) t join t2 using (i);
 count
-------
     0
(1 row)

gpadmin=#

Am I missing anything?

kainwen avatar Aug 18 '22 01:08 kainwen

I configured and built the current master branch with the options --with-python --disable-gpcloud --enable-cassert CFLAGS="-ggdb -g3 -O0 -fno-omit-frame-pointer" --enable-debug-extensions --enable-orca --enable-depend, and then followed the same steps as in the description:

evgeniy@evgeniy-pc:~/gpdb/orig_greenplum/gpdb$ git status
On branch master
Your branch is up to date with 'origin/master'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   gpcontrib/gpcloud/test/googletest (new commits)

no changes added to commit (use "git add" and/or "git commit -a")
evgeniy@evgeniy-pc:~/gpdb/orig_greenplum/gpdb$ gpconfig -c shared_preload_libraries -v 'auto_explain';
20220830:14:43:59:270999 gpconfig:evgeniy-pc:evgeniy-[INFO]:-completed successfully with parameters '-c shared_preload_libraries -v auto_explain'
evgeniy@evgeniy-pc:~/gpdb/orig_greenplum/gpdb$ gpconfig -c auto_explain.log_min_duration -v 0 --skipvalidation;
20220830:14:44:08:271112 gpconfig:evgeniy-pc:evgeniy-[INFO]:-completed successfully with parameters '-c auto_explain.log_min_duration -v 0 --skipvalidation'
evgeniy@evgeniy-pc:~/gpdb/orig_greenplum/gpdb$ gpconfig -c auto_explain.log_analize -v true --skipvalidation;
20220830:14:44:16:271203 gpconfig:evgeniy-pc:evgeniy-[INFO]:-completed successfully with parameters '-c auto_explain.log_analize -v true --skipvalidation'
evgeniy@evgeniy-pc:~/gpdb/orig_greenplum/gpdb$ gpstop -raiq;
evgeniy@evgeniy-pc:~/gpdb/orig_greenplum/gpdb$ psql postgres 
psql (12beta2)
Type "help" for help.

postgres=# select version();
                                                                                                            version                                                                                                            
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 12beta2 (Greenplum Database 7.0.0-alpha.0+dev.15694.g8ac6ab9eac build dev) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.2.0-7ubuntu2) 11.2.0, 64-bit compiled on Aug 30 2022 12:54:42 (with assert checking)
(1 row)

postgres=# CREATE TABLE t1(i int);
CREATE TABLE t2(i int);
SELECT count(1) from (select i from t1 limit 10) t join t2 using (i);
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'i' as the Greenplum Database data distribution key for this table.
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'i' as the Greenplum Database data distribution key for this table.
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
ERROR:  bogus varno: 65001 (ruleutils.c:6961)  (entry db 127.0.1.1:7000 pid=271713) (ruleutils.c:6961)
postgres=#

With my fix the problem does not appear.

red1452 avatar Aug 30 '22 07:08 red1452

Got it. Will take a look at this.

kainwen avatar Sep 06 '22 07:09 kainwen

Hi, the new issue is due to the entrydb QE: when using the planner, the reproduction in this PR does not schedule a singleQE slice to the entry db, while ORCA does.

This PR targets the gpdb master branch. On master, the postmaster first sets the global variable Gp_role while parsing its command line during startup, and only then does process_shared_preload_libraries call auto_explain's _PG_init.

So for this PR, I think we only need the macro change. But it is still OK to change the code like this, because shared libraries are only loaded during startup.

Why this error happens I have not looked into yet; my guess is that it is due to the plan not being fully executed in the entrydb.

~For 6X, I do reproduce a crash, but have not looked into it yet.~

The 6X crash is due to a Gp_role management issue: even segment postmasters start up as dispatch, and QEs fetch the connection parameters from the QD and only then change the role to QE. Thus this library is also loaded on each segment.

So for 6X this PR's modification is OK.

kainwen avatar Sep 06 '22 14:09 kainwen

My two cents for this PR:

  1. do not change code in _PG_init
  2. reword the commit message

then we can push this.

Thanks a lot for your contribution! @red1452

kainwen avatar Sep 07 '22 02:09 kainwen

Pushed, thanks a lot for your contribution.

kainwen avatar Sep 08 '22 01:09 kainwen