llvm-project
llvm-project copied to clipboard
[LoopVectorize] Clang fails to vectorize simple polynomial hash
Here is example of a simple polynomial hash that could not be auto-vectorized with SSE 4.1 (LV could not prove legality):
https://godbolt.org/z/86o9zPT51 Original test:
unsigned rabin_karp_naive(unsigned *s, unsigned len) {
unsigned hash = 0;
for (unsigned i = 0; i < len; i++) {
hash *= 31;
hash += s[i];
}
return hash;
}
This is a very typical example of polynomial hash, used, for example, in Rabin-Karp algorithm (https://en.wikipedia.org/wiki/Rabin%E2%80%93Karp_algorithm). Another iconic application is Java String hashcode.
When we restructure the code into its equivalent, LV can vectorize it:
unsigned rabin_karp_vector(unsigned *s, unsigned len) {
unsigned hash = 0;
unsigned k[4] = {31 * 31 * 31, 31 * 31, 31, 1};
const unsigned vector_factor = 4;
unsigned i = 0;
for (i = 0; i + 3 < len; i += 4) {
for (unsigned j = 0; j < 4; j++) {
hash += k[j] * s[i + j];
}
for (unsigned j = 0; j < 4; j++) {
k[j] *= 31 * 31 * 31 * 31;
}
}
for (; i < len; i++) {
hash *= 31;
hash += s[i];
}
return hash;
}
It would be nice if the original example was vectorizable.
Another (also pretty straightforward) version of same algorithm doesn't vectorize either:
unsigned rabin_karp_naive_v2(unsigned *s, unsigned len) {
unsigned hash = 0;
unsigned mul = 1;
for (unsigned i = 0; i < len; i++) {
hash += mul * s[len - 1 - i];
mul *= 31;
}
return hash;
}