Signed char extraction fails to take sign into account
kernel void foo_fixed(global uint* A, global char4* B) {
char4 val = *B;
A[0] = val.x;
A[1] = val.y;
A[2] = val.z;
A[3] = val.w;
}
Generates, in part, this code:
%34 = OpShiftRightLogical %uint %33 %uint_0
%35 = OpBitwiseAnd %uint %34 %uint_255
%36 = OpSConvert %uint %34 ; This signed conversion is not enough
OpStore %31 %36
The bug is that the LLVM i8 is converted to a SPIR-V uint, and so the SConvert is from a uint to a uint. This does not do the sign extension that LLVM expects will be implicitly done as part of the conversion.
The clspv compiler needs to insert more instructions to replicate the sign bit from position 7 to all the remainder.
The root cause is the translation of
%conv = sext i8 %1 to i32
The i8 is mapped to a SPIR-V 32-bit unsigned int. But the conversion of the SExt doesn't take that into account.
Original OpenCL C:
kernel void bar(global int* A, global char4* B) {
*A = B->y;
}
last LLVM:
%0 = load <4 x i8>, <4 x i8> addrspace(1)* %B, align 4
%1 = extractelement <4 x i8> %0, i32 1
%conv = sext i8 %1 to i32
store i32 %conv, i32 addrspace(1)* %A, align 4
becomes SPIR-V:
%28 = OpLoad %uint %27
%29 = OpShiftRightLogical %uint %28 %uint_8
%30 = OpBitwiseAnd %uint %29 %uint_255
%31 = OpSConvert %uint %29 ; Not right because source type is too wide in SPIR-V
OpStore %26 %31
There's a corresponding problem with unsigned char.
Also sitofp i8 to float
kernel void bar(global float* A, global char4* B) {
*A = (float)(B->y);
}
Final LLVM:
%1 = extractelement <4 x i8> %0, i32 1
%conv = sitofp i8 %1 to float
store float %conv, float addrspace(1)* %A, align 4
becomes SPIR-V:
%29 = OpLoad %uint %28
%30 = OpShiftRightLogical %uint %29 %uint_8
%31 = OpBitwiseAnd %uint %30 %uint_255
%32 = OpConvertSToF %float %30
I can't reproduce a problem with unsigned char.
Default behaviour (when char support is enabled) is ok, but disabling char support is still problematic.