gpdb Not grab distributed snapshot if it's direct dispatch

Currently, we will create a distributed snapshot in the function GetSnapshotData() if we are QD, and we will iterate procArray again to get the global xmin/xmax/xip.

But if the current query could be dispatched to a single segment directly, which means it's a direct dispatch, there is no need to create a distributed snapshot, the local snapshot is enough.

Aug 24 '22 05:08 SmartKeyerror

Hi, thanks for your contribution. Following are my thoughts to involve more discussion.

Consider a distributed transaction insert into to load data to each segment, say seg0, seg1, seg2, ..., seg1000. Lets call this transaction dtx1 and it now commit, due to some network or OS issue, in time order, it commits in seg0, then seg1, ..., then seg100. (NOTE: this is just make discussion easy, not impact the issue).

Then another transaction is as below:

begin;
select * from t; -- select the table which is loading data in the dtx1, due to distributed snapshot, it canne see the the effect of dtx1
select * from where a = xxx; -- suppose this directly dispatch to seg0, then based on this PR, it can see the effect of dtx1

Will the above case happen with this PR (sorry I do not dig into this)? If so, is the above correct or wrong?

Aug 24 '22 10:08 kainwen

Question (outside the scope of this PR): Do we need distributed snapshot for parse analyze phase?

Aug 25 '22 00:08 ashwinstar

Question (outside the scope of this PR): Do we need distributed snapshot for parse analyze phase?

Hi @ashwinstar

I believe the answer is not.

The only concern is during greenplum's planning stage, we will evaluate stable function, and a stable function might query a distributed table. However, to my understand, a stable function should not query a distributed table. And even a stable function can, we need to play to see if it will generate a new snaphot for the SQL.

Aug 25 '22 01:08 kainwen

I believe the answer is not.

The only concern is during greenplum's planning stage, we will evaluate stable function, and a stable function might query a distributed table. However, to my understand, a stable function should not query a distributed table. And even a stable function can, we need to play to see if it will generate a new snaphot for the SQL.

I have some different opinons, according to the offical document, a stable function is:

A STABLE function cannot modify the database and is guaranteed to return the same results given the same arguments for all rows within a single statement.

This means we can use "SELECT * FROM ...." in a stable function, and this has feature been applied to Greenplum, such as function func1_read_setint_sql_stb in the test file src/test/regress/sql/qp_functions_in_contexts_setup.sql:

CREATE TABLE bar(c int, d int);
INSERT INTO bar VALUES (1, 1), (2, 2);

CREATE FUNCTION func1_read_setint_sql_stb(x int) RETURNS setof int AS $$
DECLARE
    r int;
BEGIN
    FOR r in SELECT d FROM bar WHERE c <> $1
    LOOP
        RETURN NEXT r;
    END LOOP;
    RETURN;
END
$$ LANGUAGE plpgsql STABLE READS SQL DATA;

SELECT func1_read_setint_sql_stb(3);

explain SELECT func1_read_setint_sql_stb(3);
                   QUERY PLAN                    
-------------------------------------------------
 ProjectSet  (cost=0.00..5.27 rows=1000 width=4)
   ->  Result  (cost=0.00..0.01 rows=1 width=0)
 Optimizer: Postgres query optimizer
(3 rows)

Therefore, I think we can not remove the snapshot before the parse stage.

Aug 25 '22 02:08 SmartKeyerror

Hi, thanks for your contribution. Following are my thoughts to involve more discussion.

Consider a distributed transaction insert into to load data to each segment, say seg0, seg1, seg2, ..., seg1000. Lets call this transaction dtx1 and it now commit, due to some network or OS issue, in time order, it commits in seg0, then seg1, ..., then seg100. (NOTE: this is just make discussion easy, not impact the issue).

Then another transaction is as below:
begin;
select * from t; -- select the table which is loading data in the dtx1, due to distributed snapshot, it canne see the the effect of dtx1
select * from where a = xxx; -- suppose this directly dispatch to seg0, then based on this PR, it can see the effect of dtx1
Will the above case happen with this PR (sorry I do not dig into this)? If so, is the above correct or wrong?

Thanks for the reminder! You are right, we can not use this improvement in an explicit transaction that starts with BEGIN;, we can only apply this in the implicit transaction.

Aug 25 '22 02:08 SmartKeyerror

I believe the answer is not. The only concern is during greenplum's planning stage, we will evaluate stable function, and a stable function might query a distributed table. However, to my understand, a stable function should not query a distributed table. And even a stable function can, we need to play to see if it will generate a new snaphot for the SQL.

I have some different opinons, according to the offical document, a stable function is:

A STABLE function cannot modify the database and is guaranteed to return the same results given the same arguments for all rows within a single statement.

This means we can use "SELECT * FROM ...." in a stable function, and this has feature been applied to Greenplum, such as function func1_read_setint_sql_stb in the test file src/test/regress/sql/qp_functions_in_contexts_setup.sql:
CREATE TABLE bar(c int, d int);
INSERT INTO bar VALUES (1, 1), (2, 2);

CREATE FUNCTION func1_read_setint_sql_stb(x int) RETURNS setof int AS $$
DECLARE
    r int;
BEGIN
    FOR r in SELECT d FROM bar WHERE c <> $1
    LOOP
        RETURN NEXT r;
    END LOOP;
    RETURN;
END
$$ LANGUAGE plpgsql STABLE READS SQL DATA;

SELECT func1_read_setint_sql_stb(3);

explain SELECT func1_read_setint_sql_stb(3);
                   QUERY PLAN                    
-------------------------------------------------
 ProjectSet  (cost=0.00..5.27 rows=1000 width=4)
   ->  Result  (cost=0.00..0.01 rows=1 width=0)
 Optimizer: Postgres query optimizer
(3 rows)
Therefore, I think we can not remove the snapshot before the parse stage.

stable, volatile is constraint for execution stage. Now it is planning stage, things become very tricky.

Better solution should be:

do not get distributed snapshot in the main code path
only fetch a new snapshot when SPI evaluate SQL for stable function. (rare cases)

Aug 25 '22 02:08 kainwen

do not get distributed snapshot in the main code path only fetch a new snapshot when SPI evaluate SQL for stable function. (rare cases)

But if we do so, we need to maintain every place that need distributed snapshot in the planning stage, we could not promise we'll not make mistake when we add new features. Although it's right, but it'll bring a lot of mental burdens.

Aug 25 '22 03:08 SmartKeyerror

This PR has been silent for a long time, have any new questions or thoughts? @avamingli @kainwen @ashwinstar

Sep 19 '22 02:09 SmartKeyerror

I have a doubt that how to ensure the sqls to test our codes actually.

-- test the distributed snapshot in the situation of direct dispatch create table direct_dispatch_snapshot_alpha(a int, b int); create table direct_dispatch_snapshot_beta(a int, b int);

As before we fix the blockstate of txn, they were there but actually didn't behave as expected.

Others look good, thanks.

Sep 20 '22 01:09 avamingli

I have updated the test cases, if somebody has time, take a look, thanks.

Sep 21 '22 05:09 SmartKeyerror

Some comments:

can update|delete|insert benefited by this idea?
can we make the function checkNeedDistributedSnapshot a static bool

Sep 22 '22 02:09 kainwen

Some comments:

can update|delete|insert benefited by this idea?

can we make the function checkNeedDistributedSnapshot a static bool

No, because update|delete|insert will not use PORTAL_ONE_SELECT strategy.
Since all functions in src/include/cdb/cdbtm.h is extern, just want to keep it consistent.

Sep 22 '22 02:09 SmartKeyerror

No, because update|delete|insert will not use PORTAL_ONE_SELECT strategy.

Can we take a look at which strategy they use and if we can use the same idea there.

Sep 22 '22 02:09 kainwen

No, because update|delete|insert will not use PORTAL_ONE_SELECT strategy.

Can we take a look at which strategy they use and if we can use the same idea there.

It'll get snapshot in the function PortalRunMulti()

Sep 22 '22 03:09 SmartKeyerror

LGTM.

Oct 11 '22 06:10 haolinw

gpdb gpdb copied to clipboard

Not grab distributed snapshot if it's direct dispatch

gpdb
gpdb copied to clipboard