NyuziToolchain
Loop vectorizer bloats code in some cases
The following code:
void* memset(void *dst, int c, unsigned int n)
{
    int i;
    for (i = 0; i < n; i++)
        ((char*) dst)[i] = c;
    return dst;
}
When compiled with the "Loop Vectorization" pass enabled, this generates a very large unrolled loop containing 1024 store instructions:
and s3, s2, 1023
sub_i s3, s2, s3
move s4, 0
b .LBB0_3
.LBB0_3: # %vector.body
# =>This Inner Loop Header: Depth=1
add_i s5, s0, s4
store_8 s1, 1(s5)
store_8 s1, (s5)
store_8 s1, 2(s5)
store_8 s1, 3(s5)
store_8 s1, 4(s5)
store_8 s1, 5(s5)
store_8 s1, 6(s5)
store_8 s1, 7(s5)
...
store_8 s1, 1023(s5)
add_i s4, s4, 1024
cmpne_i s5, s4, s3
bnz s5, .LBB0_3
b .LBB0_4
Also, the generated code doesn't appear to use vectors at all: the loop body touches only scalar registers (s0 through s5). The cost model probably needs to be tweaked to discourage the vectorizer from doing this.
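For reference, the shape of the generated loop can be modeled in C roughly as follows. This is only a sketch read off the listing above, not compiler output: the name memset_unrolled_model and the UNROLL macro are made up for illustration, and the trailing remainder loop is an assumption, since the listing stops at the branch to .LBB0_4.

```c
#define UNROLL 1024u  /* hypothetical name: bytes stored per trip through vector.body */

/* Hypothetical model of the code the vectorizer produced for memset. */
void *memset_unrolled_model(void *dst, int c, unsigned int n)
{
    char *p = (char *)dst;

    /* and s3, s2, 1023 ; sub_i s3, s2, s3
       => n rounded down to a multiple of 1024 */
    unsigned int main_count = n - (n & (UNROLL - 1));
    unsigned int i = 0;

    /* vector.body: in the real output this inner loop is fully unrolled
       into 1024 separate store_8 instructions, all scalar. */
    while (i != main_count) {
        for (unsigned int k = 0; k < UNROLL; k++)
            p[i + k] = (char)c;
        i += UNROLL;  /* add_i s4, s4, 1024 */
    }

    /* Assumed scalar remainder loop for the last n & 1023 bytes
       (not visible in the listing). */
    for (; i < n; i++)
        p[i] = (char)c;

    return dst;
}
```

Seen this way, the transformation is plain scalar unrolling by 1024 with no vector stores, so it costs a lot of code size without a corresponding benefit.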
NyuziToolchain/lib/Transforms/Vectorize/LoopVectorize.cpp
The loop vectorizer is currently disabled by default in tools/clang/lib/Driver/ToolChains/Clang.cpp:
@@@ -4447,6 -4852,6 +4854,10 @@@ void Clang::ConstructJob(Compilation &C
// selected. For optimization levels that want vectorization we use the alias
// option to simplify the hasFlag logic.
bool EnableVec = shouldEnableVectorizerAtOLevel(Args, false);
++
++ // XXX Nyuzi
++ EnableVec = false;
++
OptSpecifier VectorizeAliasOption =
EnableVec ? options::OPT_O_Group : options::OPT_fvectorize;
if (Args.hasFlag(options::OPT_fvectorize, VectorizeAliasOption,