gpdb icon indicating copy to clipboard operation
gpdb copied to clipboard

Consider MPP FDW LIMIT pushdown when both offset and limit clause are specified

Open jingwen-yang-yjw opened this issue 1 year ago • 3 comments

What this PR does

Before this PR, GPDB won't push OFFSET/LIMIT in the remote servers for MPP FDW when both offset and limit clause are specified. Because it's unsafe if we don't adjust Remote SQL.

In this PR, we consider MPP FDW OFFSET/LIMIT pushdown when both OFFSET and LIMIT clause are specified. We will adjust Remote SQL when mpp_execute = 'all segments' for query including OFFSET/LIMIT clause.

Implementation Details

There are 3 cases for query including OFFSET/LIMIT clause.

Case 1: only LIMIT is specified Two-phase LIMIT plan is enough and there is NO need to adjust Remote SQL. This case had been supported. (Look at PR https://github.com/greenplum-db/gpdb/pull/16177 for more information.)

Case 2: both LIMIT and OFFSET are specified We also need to adjust Remote SQL to let GPDB fetch tuples from 0 to limitOffset + limitCount from remote servers.

-- Simple query with OFFSET and LIMIT clause together is pushed down.
EXPLAIN (VERBOSE, COSTS OFF)
SELECT c1, c2 FROM mpp_ft2 order by c1 offset 2 limit 3;
                                          QUERY PLAN                                           
-----------------------------------------------------------------------------------------------
 Limit
   Output: c1, c2
   ->  Gather Motion 2:1  (slice1; segments: 2)
         Output: c1, c2
         Merge Key: c1
         ->  Foreign Scan on public.mpp_ft2
               Output: c1, c2
               Remote SQL: SELECT c1, c2 FROM "MPP_S 1"."T 2" ORDER BY c1 ASC NULLS LAST LIMIT (2::bigint + 3::bigint)
 Optimizer: Postgres-based planner

Case 3: only OFFSET is specified In this case, we also need to adjust Remote SQL to let GPDB fetch all tuples of foreign table from remote servers.

-- Simple query with only OFFSET clause is NOT pushed down.
EXPLAIN (VERBOSE, COSTS OFF)
SELECT c1, c2 FROM mpp_ft2 order by c1 offset 998;
                                       QUERY PLAN                                        
-----------------------------------------------------------------------------------------
 Limit
   Output: c1, c2
   ->  Gather Motion 2:1  (slice1; segments: 2)
         Output: c1, c2
         Merge Key: c1
         ->  Foreign Scan on public.mpp_ft2
               Output: c1, c2
               Remote SQL: SELECT c1, c2 FROM "MPP_S 1"."T 2" ORDER BY c1 ASC NULLS LAST
 Optimizer: Postgres-based planner

Pipeline: https://dev.ci.gpdb.pivotal.io/teams/main/pipelines/gpdb-dev-improve_mpp_offset_limit_pushdown-rocky8?group=all

jingwen-yang-yjw avatar Jan 03 '24 07:01 jingwen-yang-yjw