amazon-redshift-utils
amazon-redshift-utils copied to clipboard
Vacuum ReIndex is running repeatedly for certain tables
I have used the script that was made available under the path amazon-redshift-utils/src/AnalyzeVacuumUtility/lib/analyze_vacuum.py
As per my understanding we are mainly depending on interleaved_skew value from svv_interleaved_columns which is the ratio of skew present in interleaved sort columns(> 1.4)
I have run the script once and some 30 tables in the schema have gone through vacuum ReIndex consuming 3 hours of time and that was good as it was for the first time I have run the vacuum ReIndex, But it was very unobvious and unfortunate that even when I ran the script for 2nd time with no new inserts or updates on the tables the script again picked the same tables and ran the vacuum ReIndex for 3 more hours..
I have gone through the log and found that the interleaved_skew value didn't go down after 1st vacuum ReIndex and so as the interleaved_skew is > 1.4 they are being picked again and vacuum ReIndex ran against them.
Is this an expected behavior or are we fooling ourselves by using interleaved_skew as the value of consideration? Do we have any other stat that we could rely on to decide if a table should go through vacuum ReIndex other than interleaved_skew? And I have tried to know about svv_interleaved_counts and found probably nothing.. It would help if there is little explanation on it.