pg_hint_plan Regression with Parallel hints combined with removal of indexes in Index hints

A user has reported a regression in the latest release series regarding subject, related to commit 684986acc3b3156aaf22c1571cd44dd34059a346 (applies to all stable branches). Imagine two simple tables:

CREATE TABLE bookings (
    book_ref varchar NOT NULL,
    book_date timestamp with time zone NOT NULL,
    total_amount numeric NOT NULL 
); 
CREATE INDEX ON bookings(total_amount);
CREATE UNIQUE INDEX ON bookings(book_ref) INCLUDE (book_date);
ALTER TABLE bookings ADD CONSTRAINT bookings_pkey PRIMARY KEY
    USING INDEX bookings_book_ref_book_date_idx;
ANALYZE bookings;
                                                                                                                                                                           
CREATE TABLE bookings_tmp
    AS SELECT * FROM bookings ORDER BY book_ref;
ALTER TABLE bookings_tmp ADD PRIMARY KEY(book_ref);
ANALYZE bookings_tmp;

Now here is an exotic case:

/*+
    IndexScan(a bookings_total_amount_idx)
    IndexScan(b bookings_tmp_not_exist)
    Parallel(a 4 hard)
    Parallel(b 2 hard)
*/                                                                                                                                                                                                           
EXPLAIN(costs false) SELECT * FROM bookings a,bookings_tmp b
  WHERE a.book_ref = b.book_ref and a.total_amount > 28000.00;

Under 1.4.1, we are getting this plan:

                                  QUERY PLAN                                   
-------------------------------------------------------------------------------
 Nested Loop
   Join Filter: (a.book_ref = b.book_ref)
   ->  Gather
         Workers Planned: 4
         ->  Parallel Index Scan using bookings_total_amount_idx on bookings a
               Index Cond: (total_amount > 28000.00)
   ->  Gather
         Workers Planned: 2
         ->  Parallel Seq Scan on bookings_tmp b
(9 rows)

I think that this is not correct, for two reasons:

IndexScan(b bookings_tmp_not_exist) is listed as used, but it does not make sense because it lists an index that does not exist.
Why is the sequential scan enforced under a gather? Shouldn't this use an Index Scan under the gather based on the pkey of bookings_tmp? I get that both behavior are debatable, but this looks strange for me as historical behavior.

Now, under 1.4.2, we get this plan:

                                  QUERY PLAN                                   
-------------------------------------------------------------------------------
 Nested Loop
   Join Filter: (a.book_ref = b.book_ref)
   ->  Gather
         Workers Planned: 4
         ->  Parallel Index Scan using bookings_total_amount_idx on bookings a
               Index Cond: (total_amount > 28000.00)
   ->  Seq Scan on bookings_tmp b
(7 rows)

This is better in the fact that IndexScan(b bookings_tmp_not_exist) is marked as unused. Still, this is worse in the fact that the Parallel hint with the number of workers gets entirely ignored, leading to a regression.

I've been considering a couple of options about what to tweak in the case where an index restriction is applied, but my mind comes down to a few conclusions:

For now, I would like to choose a revert of 684986acc3b3156aaf22c1571cd44dd34059a346 & friends on stable branches, getting back to the previous-still-somewhat-broken behavior.
Add a few more regression tests based on what I saw reported back to me here. I'm OK to send a patch about that.
@samimseih has mentioned me a potential solution to enforce only a subset of the ENABLE masks for the GUCs controlling if some scans should be used or not, rather than bypass the whole. This slightly impacts the existing plans, so perhaps a revert on HEAD of the original patch makes sense anyway. Or folks here are OK to let the new behavior on HEAD? I mean, it correctly discards indexes, so removing what looks like an improvement in hint detection would be sad, and it has the merit of letting the change brew longer with the release of this project for 17~.

Nov 06 '23 23:11 michaelpq

Thanks for the report @michaelpq

After we spoke offline, I spent time looking further into this. What I observed is the follows:

skipping over setup_scan_method_enforcement https://github.com/ossc-db/pg_hint_plan/blob/master/pg_hint_plan.c#L3955-L3956 does not prevent the setting up of parallel enforcement https://github.com/ossc-db/pg_hint_plan/blob/master/pg_hint_plan.c#L3986 I confirmed with some additional debugging statements.
Also, Testing the repro case on 1.4.1, I discovered some very inconsistent/odd parallel enforcement behavior that 684986a exposes in 1.4.2, but I do not think is the cause of. See the cases below.

postgres=# \dx
                    List of installed extensions
     Name     | Version |   Schema   |         Description          
--------------+---------+------------+------------------------------
 pg_hint_plan | 1.4.1   | hint_plan  | 
 plpgsql      | 1.0     | pg_catalog | PL/pgSQL procedural language
(2 rows)

set parallel_setup_cost=0;
set parallel_tuple_cost=0;
set min_parallel_index_scan_size to 0;
set min_parallel_table_scan_size=0;                                                                               
set max_parallel_workers_per_gather To 4;

I have a repro of the test case mentioned in the thread.

postgres=# /*+                            
    IndexScan(a bookings_total_amount_idx)
    IndexScan(b bookings_tmp_not_exist)
    Parallel(a 4 hard)
    Parallel(b 2 hard)
*/                                                                                                                                                                                                           
EXPLAIN(costs false) SELECT * FROM bookings a,bookings_tmp b
  WHERE a.book_ref = b.book_ref and a.total_amount > 28000.00;
                                  QUERY PLAN                                   
-------------------------------------------------------------------------------
 Nested Loop
   Join Filter: ((a.book_ref)::text = (b.book_ref)::text)
   ->  Gather
         Workers Planned: 4
         ->  Parallel Index Scan using bookings_total_amount_idx on bookings a
               Index Cond: (total_amount > 28000.00)
   ->  Gather
         Workers Planned: 2
         ->  Parallel Seq Scan on bookings_tmp b
(9 rows)

Here is where things get interesting. If only the Parallel hints are supplied, no parallel hint is enforced.

postgres=# /*+                            
    Parallel(a 4 hard)
    Parallel(b 2 hard)
*/

EXPLAIN(costs false) SELECT * FROM bookings a,bookings_tmp b
  WHERE a.book_ref = b.book_ref and a.total_amount > 28000.00;

                        QUERY PLAN                        
----------------------------------------------------------
 Nested Loop
   Join Filter: ((a.book_ref)::text = (b.book_ref)::text)
   ->  Seq Scan on bookings a
         Filter: (total_amount > 28000.00)
   ->  Seq Scan on bookings_tmp b
(5 rows)

I can force the parallel plan by keeping only index scans enabled. But, notice it's not a Parallel Seq scan on "b", but rather a Parallel Index Scan.

set enable_mergejoin = off;
set enable_seqscan = off;
set enable_hashjoin = off;
SET
SET
SET
postgres=# /*+                            
    Parallel(a 6 hard)
    Parallel(b 2 hard)
*/                                                                                                                                                                                                           
EXPLAIN(costs false) SELECT * FROM bookings a,bookings_tmp b
  WHERE a.book_ref = b.book_ref and a.total_amount > 28000.00;
                                QUERY PLAN                                 
---------------------------------------------------------------------------
 Nested Loop
   Join Filter: ((a.book_ref)::text = (b.book_ref)::text)
   ->  Gather
         Workers Planned: 6
         ->  Parallel Index Scan using bookings_pkey on bookings a
               Filter: (total_amount > 28000.00)
   ->  Gather
         Workers Planned: 2
         ->  Parallel Index Scan using bookings_tmp_pkey on bookings_tmp b
(9 rows)

It is clear that parallel enforcement, including the "hard" option, is not showing the expected results even in 1.4.1

thoughts?

Regards,

Sami

Nov 07 '23 20:11 samimseih

For now, I would like to choose a revert of https://github.com/ossc-db/pg_hint_plan/commit/684986acc3b3156aaf22c1571cd44dd34059a346 & friends on stable branches, getting back to the previous-still-somewhat-broken behavior.

With the above said, I do agree to revert the patch on stable branches, since the existing behavior, while not great, is what the users expect.

Nov 07 '23 22:11 samimseih

It is clear that parallel enforcement, including the "hard" option, is not showing the expected results even in 1.4.1.

Yeah. The fact that we would completely discard the Parallel hints when mixing more than one table sounds strange to me.

At this stage, I think that we'd better document all that in the shape of tests where one could look at a regression.diffs to see the difference generated when tweaking the code, or development is not going to scale well. I'm OK if we add comments like "strangely, this ignores both Parallel hints, perhaps these should be pushed under two Gather nodes". Or stuff like that.

I'll go put the stable branches in their previous stable with a revert, as I doubt that we've appreciated all the plan manipulations possible yet. Changing that in a major release is less or an issue to me. At this stage I'd rather prioritize plan stability to not impact users across these minor updated.

Nov 07 '23 22:11 michaelpq

I agree. I'll submit a patch for Parallel Hint tests ( including those mentioned above ).

Nov 07 '23 23:11 samimseih

Attached are tests that cover several of the gaps discovered in this discussion:

1/ Ensure that Parallel enforcement does not break when a non-existent index is used in an IndexScan hint. 2/ Demonstrate that Parallel hints cannot be enforced on empty tables.

0001-v1-Add-additional-parallel-hint-regression-tests.patch

Nov 10 '23 04:11 samimseih

Thanks for the patch.

The patch you are proposing for the additional tests has no explanation about what's being done, like the interactions between the non-existing indexes and the hints. It may be good to document now with some big XXX that some of these plans are weird because of YYY.

I am not sure that it is a good idea to use and empty the existing tables t5 and t6 for these new tests. Perhaps splitting that into a new file would make more sense?

Nov 15 '23 07:11 michaelpq

The patch you are proposing for the additional tests has no explanation about what's being done,

I hesitated doing that as it's not the norm to add much documentation to the tests, but that is not a good reason not to. I will add a better description.

I am not sure that it is a good idea to use and empty the existing tables t5 and t6 for these new tests. Perhaps splitting that into a new file would make more sense?

I am creating new tables for these tests in the sql/ut-init.sql file. I did not think it was worth adding a new init and test files for these few tests.

+CREATE TABLE s1.t5 (LIKE s1.t1 INCLUDING ALL);
+CREATE TABLE s1.t6 (LIKE s1.t1 INCLUDING ALL);

Nov 16 '23 02:11 samimseih

After second thought, showing that the non-existing index does not prevent parallel hint enforcement is not necessary. So, the test now shows the important part. For empty tables, parallel hints can only be enforced on index scans and not sequential scans.

Attached is the revised test.

Regards,

Sami 0001-Add-Parallel-hint-tests-for-empty-tables.patch

Nov 17 '23 05:11 samimseih

pg_hint_plan pg_hint_plan copied to clipboard

Regression with Parallel hints combined with removal of indexes in Index hints

pg_hint_plan
pg_hint_plan copied to clipboard