Improperly sinking a barrier

Open alan-baker opened this issue 5 years ago • 1 comments

Originally from https://github.com/KhronosGroup/SPIRV-Cross/issues/1511.

#define THREADS 256
#define THREADS_HALF (THREADS/2)

__kernel void main_kernel(__global float* data, __global float* output, int count)
{ 
	__local float list[THREADS];

	const int point1Index = get_global_id(1);
	const int index = get_local_id(0);

	list[index] = data[index];
	
	for (int halfSize = THREADS_HALF; halfSize > 0; halfSize >>= 1) 
	{
		if (index < halfSize)
		{
			const float value1 = list[index];
			const float value2 = list[index + halfSize];
			if (value2 < value1) list[index] = value2;
		}
		barrier(CLK_LOCAL_MEM_FENCE);
	}

	if (!index)
	{
		output[point1Index] = list[0];
	}
}

LLVM loop optimizations transform the loop iteration and sink the barrier into the continue target of the loop. The loop's branch condition ends up based on the local thread id so the workgroup is diverged without a merge instruction.

Oct 28 '20 14:10 alan-baker

It is worth noting that if the barrier is moved before the if statement, clspv ends up with the barrier in the loop header.

Oct 28 '20 14:10 alan-baker