phoenix
phoenix copied to clipboard
PHOENIX-6587 Disallow specifying explicit pre-split on salted tables
Do we already prevent create table on preexisting table with salt set?
Do we already prevent create table on preexisting table with salt set?
I wouldn't want to prevent that. Copying/Loading a Phoenix data table with HBase tools into a cluster, and then running create table to re-create the Phoenix metadata is a supported and standard procedure, (even if I have seen people messing that up.)
I haven't tested it, but as we're not creating a new HBase table, we probably either error out, or just ignore the pre-split points in this case. I'm going to try out what happens in this case.
Do we already prevent create table on preexisting table with salt set?
I wouldn't want to prevent that. Copying/Loading a Phoenix data table with HBase tools into a cluster, and then running create table to re-create the Phoenix metadata is a supported and standard procedure, (even if I have seen people messing that up.)
I haven't tested it, but as we're not creating a new HBase table, we probably either error out, or just ignore the pre-split points in this case. I'm going to try out what happens in this case.
I know we had a case in pre-production where we got region boundaries that overlapped the salt boundaries via restore or manual create table ontop of existing table. This led to wrong results due to phoenix's explicit assumption on the salted boundaries. Maybe a better approach for this worry is to check during create table that the table is split sanely if its salted etc.
thank you @dbwong
Maybe a better approach for this worry is to check during create table that the table is split sanely if its salted etc.
I have started working on this, but got sidetracked by a deadline on another project. I'm about 50% done with the split verification solution, and hope to get that ready and tested by early next year.
Note that new patch removes the SchemaUtil.processSplit() logic, which attempts to expend the split point to cover the whole PK.
My reasons removing it:
- It uses arbitrary lengths when working with PKs with variable length fields, and does not (and can not, because the length is not fixed) cover the actual PK.
- I could not figure why it was there in the first place.
- It made adding the validation logic awkward
Some more reasoning to remove SchemaUtil.processSplit() :
AFAICT it tries to emulate how HBase would automatically split a region. However, I could not find any reason why we would want to do that.
- It is not necessary for correctness
- It always adds an extra region to scan, as a salt key will always extend to an extra region (but this extra region is guaranteed to not have any rows)
There two outstanding problems:
- Upgrade tests fail, because the split points on the System tables use the old longer split points, and the update code errors out.
- org.apache.phoenix.end2end.LocalIndexSplitMergeIT.testLocalIndexScanWithMergeSpecialCase() fails, but I cannot figure out why. Maybe the local index code dependes on the longer split points feature that I have removed ?
Both could probably be fixed by re-adding the split point extesion code, but I'd like to understand it more first.
org.apache.phoenix.end2end.LocalIndexSplitMergeIT.testLocalIndexScanWithMergeSpecialCase() doesn't even use salted tables. I must have broken something else in the patch.
Turns out that split processing is not necessary for salted tables, but rather for all tables that have local indexes, which LocalIndexSplitMergeIT.testLocalIndexScanWithMergeSpecialCase() has luckily caught.
This of course means that we must keep the split processing.
This also means that we are already silently changing the manually specified split points. I think that we should just fix the split points specified for salted tables, as there is precedence for similar behaviour, and I think that this is the closet thing to doing what the user wanted.
Could you please review the new version @dbwong ?
However, I need to update the ticket and patch description, as the patch does kind of the opposite of what it says.
Squashed and renamed the patch.
PHOENIX-6910 makes this moot.