llvm-project icon indicating copy to clipboard operation
llvm-project copied to clipboard

[LoopVectorize] Clang fails to vectorize simple polynomial hash

Open xortator opened this issue 3 years ago • 1 comments

Here is example of a simple polynomial hash that could not be auto-vectorized with SSE 4.1 (LV could not prove legality):

https://godbolt.org/z/86o9zPT51 Original test:

unsigned rabin_karp_naive(unsigned *s, unsigned len) {
    unsigned hash = 0;
    for (unsigned i = 0; i < len; i++) {
        hash *= 31;
        hash += s[i];
    }
    return hash;
}

This is a very typical example of polynomial hash, used, for example, in Rabin-Karp algorithm (https://en.wikipedia.org/wiki/Rabin%E2%80%93Karp_algorithm). Another iconic application is Java String hashcode.

When we restructure the code into its equivalent, LV can vectorize it:

unsigned rabin_karp_vector(unsigned *s, unsigned len) {
    unsigned hash = 0;
    unsigned k[4] = {31 * 31 * 31, 31 * 31, 31, 1};
    const unsigned vector_factor = 4;
    unsigned i = 0;
    for (i = 0; i + 3 < len; i += 4) {
        for (unsigned j = 0; j < 4; j++) {
            hash += k[j] * s[i + j];
        }
        for (unsigned j = 0; j < 4; j++) {
            k[j] *= 31 * 31 * 31 * 31;
        }
    }
    for (; i < len; i++) {
        hash *= 31;
        hash += s[i];
    }
    return hash;
}

It would be nice if the original example was vectorizable.

xortator avatar Aug 08 '22 09:08 xortator

Another (also pretty straightforward) version of same algorithm doesn't vectorize either:

unsigned rabin_karp_naive_v2(unsigned *s, unsigned len) {
    unsigned hash = 0;
    unsigned mul = 1;
    for (unsigned i = 0; i < len; i++) {
        hash += mul * s[len - 1 - i];
        mul *= 31;
    }
    return hash;
}

xortator avatar Aug 15 '22 08:08 xortator