clspv

kernel lacks LocalSize execution mode if another kernel has reqd_work_group_size

Open • dneto0 opened this issue 8 years ago • 1 comment

Here the "align" kernel has a reqd_work_group_size, so it gets a LocalSize execution mode. That attribute also suppresses generation of spec IDs for the components of the work-group-size vector. However, the "boo" kernel is generated without an associated LocalSize execution mode. This is an error. The sanest thing to do here is to emit a LocalSize of 1 1 1 for such kernels, and document it.

kernel void __attribute__((reqd_work_group_size(12,2,3))) align(global int* A, int x, float4 c) {
  *A = x + (int)c.x;
}

kernel void boo(global int* A, int x, float4 c) {
  *A = x + (int)c.x;
}

There's an unfortunate problem in the Vulkan environment spec: a specialization value for the workgroup size is free-floating, so you can't tell which compute shaders it should affect. So the proposed defaulting is about as good as we can do.
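To make the proposed defaulting concrete, here is a minimal sketch in C of the decision it implies. The `KernelInfo` and `LocalSizeDecision` types and the `decide_local_size` function are hypothetical names invented for illustration; they are not clspv's actual data structures or API. The sketch also assumes that when no kernel in the module has reqd_work_group_size, no LocalSize is emitted and the size is left to spec constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one kernel's work-group-size metadata. */
typedef struct {
    const char *name;
    bool has_reqd_wgs;   /* true if reqd_work_group_size was given */
    unsigned reqd[3];    /* the required size, valid only if has_reqd_wgs */
} KernelInfo;

/* Hypothetical result: whether to emit a LocalSize execution mode, and its value. */
typedef struct {
    bool emit_local_size;
    unsigned size[3];
} LocalSizeDecision;

/* Returns true if any kernel in the module declares reqd_work_group_size. */
static bool any_has_reqd_wgs(const KernelInfo *kernels, size_t count) {
    for (size_t i = 0; i < count; ++i) {
        if (kernels[i].has_reqd_wgs) return true;
    }
    return false;
}

/* Decides the LocalSize for kernels[index] under the proposed defaulting. */
static LocalSizeDecision decide_local_size(const KernelInfo *kernels,
                                           size_t count, size_t index) {
    assert(index < count);
    LocalSizeDecision d = { false, {0, 0, 0} };
    const KernelInfo *k = &kernels[index];
    if (k->has_reqd_wgs) {
        /* The kernel's own attribute wins. */
        d.emit_local_size = true;
        for (int i = 0; i < 3; ++i) d.size[i] = k->reqd[i];
    } else if (any_has_reqd_wgs(kernels, count)) {
        /* Spec IDs are suppressed module-wide, so default to 1 1 1. */
        d.emit_local_size = true;
        for (int i = 0; i < 3; ++i) d.size[i] = 1;
    }
    /* Otherwise (assumption): no LocalSize; spec constants supply the size. */
    return d;
}
```

Applied to the two kernels above, this gives align a LocalSize of 12 2 3 and boo a LocalSize of 1 1 1.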

dneto0, Aug 14 '17 20:08

For the above example, I think you could get away with setting the default spec constant value for gl_WorkGroupSize to 12,2,3. That way, when boo is turned into a VkPipeline, you could supply specialization constants to change the value. The issue with this, though, is that you can just as easily mess with the 'required work group size' of align at runtime.
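The hazard with this alternative can be illustrated with a small model in plain C. This is only a sketch, not Vulkan API code: the `WorkGroupSize` type, `module_default`, and `specialize` are hypothetical names, and treating a zero component as "keep the default" is a convention of this sketch rather than how Vulkan specialization constants are expressed.

```c
#include <assert.h>

/* Hypothetical model of the single module-wide work-group size. */
typedef struct { unsigned x, y, z; } WorkGroupSize;

/* Default baked into the module in this sketch: align's required size. */
static WorkGroupSize module_default(void) {
    WorkGroupSize d = {12, 2, 3};
    return d;
}

/* Applies a specialization override at pipeline-creation time.
 * Sketch convention: a component of 0 keeps the default. */
static WorkGroupSize specialize(WorkGroupSize def, WorkGroupSize ovr) {
    assert(def.x > 0 && def.y > 0 && def.z > 0);
    WorkGroupSize r = def;
    if (ovr.x) r.x = ovr.x;
    if (ovr.y) r.y = ovr.y;
    if (ovr.z) r.z = ovr.z;
    return r;
}
```

Nothing in this model ties an override to boo. Specializing a pipeline built from align with, say, 64 1 1 replaces its required 12 2 3 in exactly the same way, which is the problem described above.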

Realistically we need to fix the spec (somehow in some distant future) to allow this, but for now a bodge is the best we can do!

sheredom, Aug 21 '17 14:08