phoenix
PHOENIX-7593: Enable CompactionScanner for flushes
Not related to this PR, but as a general improvement: this method should not be named isEmptyColumn(), because it does not perform any empty-column-specific check. All it checks is whether the given cell's CF and CQ match the ones passed in:
public static boolean isEmptyColumn(Cell cell, byte[] emptyCF, byte[] emptyCQ) {
    return CellUtil.matchingFamily(cell, emptyCF, 0, emptyCF.length) &&
        CellUtil.matchingQualifier(cell, emptyCQ, 0, emptyCQ.length);
}
We should remove the above utility because HBase's CellUtil already provides exactly the same check:
public static boolean matchingColumn(final Cell left, final byte[] fam, final byte[] qual) {
    return matchingFamily(left, fam) && matchingQualifier(left, qual);
}
(worth doing as separate Jira/PR though)
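To illustrate why the two helpers are interchangeable, here is a minimal standalone sketch. It does not use HBase: SimpleCell, ColumnMatchSketch and the methods inside it are hypothetical stand-ins that only mirror the shape of the two snippets above, and the "0" / "_0" family and qualifier values are illustrative. Treat it as a sketch of the equivalence under those assumptions, not as the actual Phoenix or HBase code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ColumnMatchSketch {

    // Hypothetical stand-in for an HBase Cell: only family and qualifier bytes.
    public static final class SimpleCell {
        final byte[] family;
        final byte[] qualifier;

        public SimpleCell(byte[] family, byte[] qualifier) {
            this.family = family;
            this.qualifier = qualifier;
        }
    }

    // Stand-in for a ranged family match: compares against buf[offset, offset + length).
    public static boolean matchingFamily(SimpleCell c, byte[] buf, int offset, int length) {
        return Arrays.equals(c.family, Arrays.copyOfRange(buf, offset, offset + length));
    }

    // Stand-in for a ranged qualifier match.
    public static boolean matchingQualifier(SimpleCell c, byte[] buf, int offset, int length) {
        return Arrays.equals(c.qualifier, Arrays.copyOfRange(buf, offset, offset + length));
    }

    // Same shape as the Phoenix helper quoted above: the full array is always used.
    public static boolean isEmptyColumn(SimpleCell cell, byte[] emptyCF, byte[] emptyCQ) {
        return matchingFamily(cell, emptyCF, 0, emptyCF.length)
            && matchingQualifier(cell, emptyCQ, 0, emptyCQ.length);
    }

    // Same shape as the CellUtil method quoted above: whole-array family and qualifier match.
    public static boolean matchingColumn(SimpleCell left, byte[] fam, byte[] qual) {
        return Arrays.equals(left.family, fam) && Arrays.equals(left.qualifier, qual);
    }

    public static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] cf = bytes("0");   // illustrative family
        byte[] cq = bytes("_0");  // illustrative qualifier
        SimpleCell match = new SimpleCell(bytes("0"), bytes("_0"));
        SimpleCell other = new SimpleCell(bytes("0"), bytes("V"));

        // Both helpers should agree on every cell.
        System.out.println(isEmptyColumn(match, cf, cq) == matchingColumn(match, cf, cq));
        System.out.println(isEmptyColumn(other, cf, cq) == matchingColumn(other, cf, cq));
    }
}
```

Because the ranged calls always pass offset 0 and the full array length, they reduce to the whole-array comparison, so existing call sites could switch to CellUtil.matchingColumn(cell, emptyCF, emptyCQ) without a behavior change.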
Created JIRA: https://issues.apache.org/jira/browse/PHOENIX-7597
@sanjeet006py Can you also do a perf study to rule out any performance degradation that could be introduced in the flush path? We have some metrics at the regionserver level, like hbase.regionserver.FlushTime, and at the per-table level, like hbase.regionserver.Namespace_default_table_<TABLENAME>_metric_flushTime_95th_percentile
@tkhurana @virajjasani the perf analysis is done: https://docs.google.com/document/d/1oQzEMP4LXOFxLHlKt1SZ5uvRLd3Vk90x39gn1hVBn0Y/edit?tab=t.0#heading=h.32xuccojgowv. Overall, enabling CompactionScanner for flushes adds some overhead (as expected), but not enough to cause a performance degradation. Thanks