Maybe we should not cancel unfinished parallel retrieve cursor when retrieve connection exiting?

Open lmzzzzz1 opened this issue 1 year ago • 3 comments
Behavior

Consider using a parallel retrieve cursor on a table with 10,000 rows of data, for example:

create table t1(a int);
insert into t1 select generate_series(1,10000);

Suppose I only want to fetch 100 rows (similar to 'SELECT * FROM xxx LIMIT 100'). After I execute retrieve 100 from endpoint xxx, I already have enough data and do not need to keep retrieving.

session 1: connect to the master normally

postgres=# begin;
BEGIN
postgres=# declare c1 PARALLEL RETRIEVE CURSOR FOR select * from t1;
DECLARE PARALLEL RETRIEVE CURSOR
postgres=# select gp_segment_id,auth_token, endpointname, port FROM pg_catalog.gp_endpoints;
 gp_segment_id |            auth_token            |    endpointname    | port
---------------+----------------------------------+--------------------+-------
             0 | f4851109b3525baedf634577572fdd53 | c10000005f0000001a | 57343
             1 | a71ca4c354d80d97ba8777f5be0c8413 | c10000005f0000001a | 57344
             2 | 887f6f7644bd2be3944f60e7933c33ce | c10000005f0000001a | 57345
(3 rows)

session 2: PGOPTIONS='-c gp_retrieve_conn=true' psql -p 57343

retrieve 100 from endpoint c10000005f0000001a;

I therefore disconnect the retrieve connection, but closing c1 afterwards fails with the error: ERROR: canceling MPP operation: "Endpoint retrieve session is quitting. All unfinished parallel retrieve cursors on the session will be terminated."

session 2: quit

session 1:

postgres=# close c1;
ERROR:  canceling MPP operation: "Endpoint retrieve session is quitting. All unfinished parallel retrieve cursors on the session will be terminated."  (seg0 192.168.31.128:57343 pid=70221)
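
The session 2 step above can be scripted. Below is a minimal Python sketch, not part of the original report: it builds the environment and psql arguments for a retrieve-mode connection from one row of gp_endpoints, then retrieves 100 rows and exits, which is the partial-retrieve-then-quit pattern described here. The function name retrieve_session_command, the row dictionary layout, and the "postgres" database name are assumptions for illustration. Authentication is left out, as in the original command.

```python
import os


def retrieve_session_command(endpoint_row, rows=100, dbname="postgres"):
    """Build (env, argv) for a retrieve-mode psql call against one endpoint.

    endpoint_row is a hypothetical dict holding one gp_endpoints row,
    e.g. {"endpointname": "c10000005f0000001a", "port": 57343}.
    """
    env = dict(os.environ)
    # Same option as in the report: marks the connection as a retrieve session.
    env["PGOPTIONS"] = "-c gp_retrieve_conn=true"

    statement = f"retrieve {rows} from endpoint {endpoint_row['endpointname']};"
    # psql -c runs a single statement and then exits, so the retrieve
    # connection quits before the endpoint has been fully drained.
    argv = ["psql", "-p", str(endpoint_row["port"]), "-d", dbname, "-c", statement]
    return env, argv


# Usage (requires a running cluster and an open cursor in session 1):
#   import subprocess
#   env, argv = retrieve_session_command({"endpointname": "c10000005f0000001a",
#                                         "port": 57343})
#   subprocess.run(argv, env=env, check=True)
```

Running this while session 1 still holds the transaction open, and then issuing close c1 in session 1, should reach the same state as the manual steps.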

Enhancement

I believe this should not be considered an error; retrieving part of the data and then exiting immediately is a valid use case. The current approach aborts the entire transaction. For this case, where a partial retrieval is followed by an exit, using QUERY_FINISH may be more reasonable. I will submit a PR to explain this further.

lmzzzzz1 avatar Apr 12 '24 06:04 lmzzzzz1

https://github.com/greenplum-db/gpdb/pull/17344

lmzzzzz1 avatar Apr 15 '24 07:04 lmzzzzz1

@lmzzzzz1 Thanks for your feedback. I reproduced it successfully, and it looks like the behavior needs to be improved. The error message is also a little odd: it says "Endpoint retrieve session is quitting ...", but session2 has already quit. We will look into the original design rationale.

@zxuejing please help to take a look at it when you are free.

interma avatar Apr 22 '24 02:04 interma

@zxuejing please help to take a look at it when you are free.

Ok, I will take a look at it!

zxuejing avatar Apr 23 '24 07:04 zxuejing