gpdb icon indicating copy to clipboard operation
gpdb copied to clipboard

Vacuum index on Append-Optimized table enhancement backport.

Open haolinw opened this issue 2 years ago • 1 comments

This is back-porting VACUUM indexes enhancement https://github.com/greenplum-db/gpdb/pull/13255

In the past, we rely on the visibility map to check if an index entry can be removed. Scanning VM is not such effective. With this new strategy, we do index vacuum iff after we recycle a PendingDrop segment, and think the index entry is removable only check if it points to a dropable segment, so the checking would be effective, and dropable segments also guarantee the index entry is invisible to anyone already.

Here are some reminders before you submit the pull request

  • [ ] Add tests for the change
  • [ ] Document changes
  • [ ] Communicate in the mailing list if needed
  • [ ] Pass make installcheck
  • [ ] Review a PR in return to support the community

haolinw avatar Jul 22 '22 09:07 haolinw

Ignore the below post, things are working fine gp_ao_or_aocs_seg() only reports QDs state, and that caused the confusion. When manually running select * pg_aoseg table to look at segments value provided proper result. So, in presence of a concurrent read transaction, the drop phase is skipped and the file remains in AWAITING_DROP on segments and also in QD's in-memory.

~Not the fault of this PR, though I was playing with it to see if the in-memory state on QD is properly reflecting the state of QE, as it's important. I found when there is a concurrent read lock held, QD state is remaining 6 (COMPACTED_AWAITING_DROP) whereas QEs state is being changed to 1 (AOSEG_STATE_DEFAULT which means available for insert).~

~Once the concurrent read session is completed, things do return back to normal.~

-- @Description Test to validate in-memory state on QD is in sync with QEs
--
CREATE TABLE ao (a INT, b INT) WITH (appendonly=true, orientation=@orientation@);
INSERT INTO ao SELECT i as a, i as b FROM generate_series(1,10) AS i;
DELETE FROM AO;
INSERT INTO ao SELECT i as a, i as b FROM generate_series(10,20) AS i;

-- start another session which will hold read lock
1: BEGIN;
1: SELECT COUNT(*) FROM ao;

-- check in-memory entry on QD
SELECT segno, state, total_tupcount, tuples_added FROM gp_toolkit.__gp_get_ao_entry_from_cache('ao'::regclass) where segno < 5;
SELECT segno, state, tupcount, modcount FROM gp_ao_or_aocs_seg('ao');
VACUUM AO;
SELECT segno, state, total_tupcount, tuples_added FROM gp_toolkit.__gp_get_ao_entry_from_cache('ao'::regclass) where segno < 5;
SELECT segno, state, tupcount, modcount FROM gp_ao_or_aocs_seg('ao');
1: abort;
VACUUM AO;
SELECT segno, state, total_tupcount, tuples_added FROM gp_toolkit.__gp_get_ao_entry_from_cache('ao'::regclass) where segno < 5;
SELECT segno, state, tupcount, modcount FROM gp_ao_or_aocs_seg('ao');
INSERT INTO ao SELECT i as a, i as b FROM generate_series(20,30) AS i;
SELECT segno, state, total_tupcount, tuples_added FROM gp_toolkit.__gp_get_ao_entry_from_cache('ao'::regclass) where segno < 5;
SELECT segno, state, tupcount, modcount FROM gp_ao_or_aocs_seg('ao');

Output

-- @Description Test to validate in-memory state on QD is in sync with QEs
--
CREATE TABLE ao (a INT, b INT) WITH (appendonly=true, orientation=row);
CREATE
INSERT INTO ao SELECT i as a, i as b FROM generate_series(1,10) AS i;
INSERT 10
DELETE FROM AO;
DELETE 10
INSERT INTO ao SELECT i as a, i as b FROM generate_series(10,20) AS i;
INSERT 11

-- start another session which will hold read lock
1: BEGIN;
BEGIN
1: SELECT COUNT(*) FROM ao;
 count 
-------
 11    
(1 row)

-- check in-memory entry on QD
SELECT segno, state, total_tupcount, tuples_added FROM gp_toolkit.__gp_get_ao_entry_from_cache('ao'::regclass) where segno < 5;
 segno | state | total_tupcount | tuples_added 
-------+-------+----------------+--------------
 0     | 1     | 0              | 0            
 1     | 1     | 21             | 0            
 2     | 1     | 0              | 0            
 3     | 1     | 0              | 0            
 4     | 1     | 0              | 0            
(5 rows)
SELECT segno, state, tupcount, modcount FROM gp_ao_or_aocs_seg('ao');
 segno | state | tupcount | modcount 
-------+-------+----------+----------
 1     | 1     | 21       | 3        
(1 row)
VACUUM AO;
VACUUM
SELECT segno, state, total_tupcount, tuples_added FROM gp_toolkit.__gp_get_ao_entry_from_cache('ao'::regclass) where segno < 5;
 segno | state | total_tupcount | tuples_added 
-------+-------+----------------+--------------
 0     | 1     | 0              | 0            
 1     | 6     | 21             | 0 
 2     | 1     | 11             | 0            
 3     | 1     | 0              | 0            
 4     | 1     | 0              | 0            
(5 rows)
SELECT segno, state, tupcount, modcount FROM gp_ao_or_aocs_seg('ao');
 segno | state | tupcount | modcount 
-------+-------+----------+----------
 1     | 1     | 21       | 3        
 2     | 1     | 11       | 0        
(2 rows)
1: abort;
ABORT
VACUUM AO;
VACUUM
SELECT segno, state, total_tupcount, tuples_added FROM gp_toolkit.__gp_get_ao_entry_from_cache('ao'::regclass) where segno < 5;
 segno | state | total_tupcount | tuples_added 
-------+-------+----------------+--------------
 0     | 1     | 0              | 0            
 1     | 1     | 0              | 0            
 2     | 1     | 11             | 0            
 3     | 1     | 0              | 0            
 4     | 1     | 0              | 0            
(5 rows)
SELECT segno, state, tupcount, modcount FROM gp_ao_or_aocs_seg('ao');
 segno | state | tupcount | modcount 
-------+-------+----------+----------
 1     | 1     | 0        | 3        
 2     | 1     | 11       | 0        
(2 rows)
INSERT INTO ao SELECT i as a, i as b FROM generate_series(20,30) AS i;
INSERT 11
SELECT segno, state, total_tupcount, tuples_added FROM gp_toolkit.__gp_get_ao_entry_from_cache('ao'::regclass) where segno < 5;
 segno | state | total_tupcount | tuples_added 
-------+-------+----------------+--------------
 0     | 1     | 0              | 0            
 1     | 1     | 11             | 0            
 2     | 1     | 11             | 0            
 3     | 1     | 0              | 0            
 4     | 1     | 0              | 0            
(5 rows)
SELECT segno, state, tupcount, modcount FROM gp_ao_or_aocs_seg('ao');
 segno | state | tupcount | modcount 
-------+-------+----------+----------
 1     | 1     | 11       | 4        
 2     | 1     | 11       | 0        
(2 rows)

ashwinstar avatar Aug 09 '22 23:08 ashwinstar