phoenix
PHOENIX-1291 Add ILIKE optimization for initial literal part
To optimize ILIKE queries whose pattern starts with a string literal, the optimization phase generates a set of ranges that limit the results, one per case variant of the literal prefix (2^n for a prefix of n letters). For example, 'abc%' generates 'abc', 'Abc', 'aBc', 'abC', 'ABc', 'AbC', 'aBC', and 'ABC'.
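As a rough illustration of the enumeration described above, here is a minimal standalone sketch. The class and method names (`CaseVariants`, `caseVariants`) are hypothetical and are not taken from the patch or from Phoenix; the sketch only shows how the 2^n prefixes can be produced, not how they are turned into key ranges.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the Phoenix code base.
class CaseVariants {

    // Returns every upper/lower case combination of the given literal prefix.
    // Characters without distinct cases (digits, punctuation) do not branch,
    // so the result has 2^n entries where n is the number of cased letters.
    static List<String> caseVariants(String prefix) {
        List<String> variants = new ArrayList<>();
        variants.add("");
        for (char c : prefix.toCharArray()) {
            char lower = Character.toLowerCase(c);
            char upper = Character.toUpperCase(c);
            List<String> next = new ArrayList<>();
            for (String v : variants) {
                next.add(v + lower);
                if (upper != lower) {
                    next.add(v + upper);
                }
            }
            variants = next;
        }
        return variants;
    }

    public static void main(String[] args) {
        // Prints the 8 variants of "abc".
        System.out.println(caseVariants("abc"));
    }
}
```

Each returned string would then serve as the literal prefix of one range, in the same way a single prefix is handled for a case-sensitive LIKE.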
I'm not completely sure this solution is valid, and I'd like to run a few performance tests, whose results I'll share here. In the meantime, I'd appreciate any feedback and suggestions. I mostly followed James Taylor's comments and the discussion in https://issues.apache.org/jira/browse/PHOENIX-1273, which the original ticket refers to.
Also, the number of ranges generated grows exponentially with the length of the prefix, so it may make sense to introduce a reasonable limit on the length of the prefix used to generate them.
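One possible way to apply such a limit is sketched below. It truncates the literal prefix once a fixed number of cased letters has been seen, so that at most 2^MAX_CASED_CHARS ranges are generated. The names `IlikePrefixLimit`, `boundedPrefix`, and the value 4 are assumptions for illustration, not something in the patch. The sketch also assumes that the full ILIKE expression is still evaluated as a filter on the scanned rows, so a shorter prefix only widens the ranges and does not change the results.

```java
// Hypothetical helper for illustration only; not part of the Phoenix code base.
class IlikePrefixLimit {

    // Assumed cap: at most 2^4 = 16 ranges per ILIKE pattern.
    static final int MAX_CASED_CHARS = 4;

    // Returns the longest leading part of literalPrefix that contains at most
    // MAX_CASED_CHARS letters with distinct upper and lower case forms.
    static String boundedPrefix(String literalPrefix) {
        int cased = 0;
        for (int i = 0; i < literalPrefix.length(); i++) {
            char c = literalPrefix.charAt(i);
            if (Character.toLowerCase(c) != Character.toUpperCase(c)) {
                if (cased == MAX_CASED_CHARS) {
                    // One more cased letter would double the range count; stop here.
                    return literalPrefix.substring(0, i);
                }
                cased++;
            }
        }
        return literalPrefix;
    }

    public static void main(String[] args) {
        System.out.println(boundedPrefix("abcdefgh")); // prints "abcd"
        System.out.println(boundedPrefix("ab12cde"));  // prints "ab12cd"
    }
}
```

Counting only cased letters, not raw length, means digits and punctuation in the prefix still narrow the ranges without adding to their number.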