gpdb
Maybe we should not cancel an unfinished parallel retrieve cursor when the retrieve connection exits?
Behavior
Consider using a parallel retrieve cursor on a table with 10,000 rows of data, for example:
create table t1(a int);
insert into t1 select generate_series(1,10000);
But if I only want to fetch 100 rows of data (similar to 'SELECT * FROM xxx LIMIT 100'), then once I execute retrieve 100 from endpoint xxx I already have enough data and do not need to retrieve any more.
session 1: normally connect to master
postgres=# begin;
BEGIN
postgres=# declare c1 PARALLEL RETRIEVE CURSOR FOR select * from t1;
DECLARE PARALLEL RETRIEVE CURSOR
postgres=# select gp_segment_id,auth_token, endpointname, port FROM pg_catalog.gp_endpoints;
 gp_segment_id |            auth_token            |    endpointname    | port
---------------+----------------------------------+--------------------+-------
             0 | f4851109b3525baedf634577572fdd53 | c10000005f0000001a | 57343
             1 | a71ca4c354d80d97ba8777f5be0c8413 | c10000005f0000001a | 57344
             2 | 887f6f7644bd2be3944f60e7933c33ce | c10000005f0000001a | 57345
(3 rows)
session2: PGOPTIONS='-c gp_retrieve_conn=true' psql -p 57343
retrieve 100 from endpoint c10000005f0000001a;
Therefore, I disconnect the retrieve connection, but I encounter an error when closing c1, which is "ERROR: canceling MPP operation: 'Endpoint retrieve session is quitting. All unfinished parallel retrieve cursors on the session will be terminated.'"
session2 quit
session1
postgres=# close c1;
ERROR: canceling MPP operation: "Endpoint retrieve session is quitting. All unfinished parallel retrieve cursors on the session will be terminated." (seg0 192.168.31.128:57343 pid=70221)
Enhancement
I believe this should not be considered an error: retrieving part of the data and then exiting immediately is a valid requirement. The current approach aborts the entire transaction. When a partial retrieve is followed by an exit, using QUERY_FINISH may be more reasonable. I will submit a PR to explain this further.
https://github.com/greenplum-db/gpdb/pull/17344
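To make the request concrete, here is a rough sketch of what I would expect to see in session 1 after session 2 has retrieved its 100 rows and quit. This is the hoped-for behavior under the proposal, not output from any existing build, and it assumes the same cursor c1 and the same open transaction as in the reproduction above.

```sql
-- session 1, after session 2 has run "retrieve 100 from endpoint ..." and quit
-- (expected behavior under this proposal, not current output)
close c1;   -- expected: succeeds, no "canceling MPP operation" error
commit;     -- expected: succeeds, the transaction is not aborted
```

The point of the sketch is only that a partial retrieve followed by an exit would leave the transaction on the master usable.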
@lmzzzzz1 Thanks for your feedback. I reproduced it successfully, and it looks like the behavior needs to be improved:
The error message is a little odd: it says Endpoint retrieve session is quitting ..., but session2 has already quit. We will look into the original design rationale.
@zxuejing please help to take a look at it when you are free.
Ok, I will take a look at it!