DirectXShaderCompiler
DirectXShaderCompiler copied to clipboard
[SPIR-V] Problem loading struct with vk::RawBufferLoad.
struct GPUObjDesc
{
float3 test;
float3 test2;
float3 test3;
soulsl::address vertex_address; // Address of the Vertex buffer
soulsl::address index_address; // Address of the index buffer
soulsl::address material_address; // Address of the material buffer
soulsl::address material_index_address; // Address of the triangle material index buffer
};
GPUObjDesc gpu_obj_desc = vk::RawBufferLoad<GPUObjDesc>(push_constant.gpu_obj_buffer_address);
float3 test2 = vk::RawBufferLoad<float3>(push_constant.gpu_obj_buffer_address + sizeof(float3));
float3 test3 = vk::RawBufferLoad<float3>(push_constant.gpu_obj_buffer_address + 2 * sizeof(float3));
For some reason gpu_obj_desc.test2 and gpu_obj_desc.test3 return the value of gpu_obj_desc.test. But test2 and test3 return the correct value. So it behaves correctly if I load the float3 by offsetting manually. I compile the hlsl with -fvk-use-dx-layout and -fspv-target-env=vulkan1.3. I have bindless mechanism as well that use ByteAddressBuffer.Load<GPUObjDesc>, that mechanism work perfectly fine when I load this same struct.
Hi, Thanks for the report! In order for us to reproduce the issue, we'd like a bit more information from your end:
- dxc version (or commit hash is custom build)
- a shader that compiles, we can directly try with, which reproduces the issue
- full command-line (Example of a good issue summary: https://github.com/microsoft/DirectXShaderCompiler/issues/4911)
Tried to guess a shader using those things (minus soulsl fields since those are not defined here), but OpAccessChain looks fine.
Actually I have look at the disassembly spir-v. And I wonder why it translates to SPIR-V differently from ByteAddressBuffer.Load<T>. In ByteAddressBuffer.Load<T>, dxc will load the member one by one using OpAccessChain (one OpAccessChain per uint) but vk::RawBufferLoad<T> will only use one OpLoad
#define WORK_GROUP_SIZE_X 16
#define WORK_GROUP_SIZE_Y 16
[[vk::binding(0, 0)]] ByteAddressBuffer global_buffer_arr[];
[[vk::binding(0, 3)]] RWTexture2D<float4> global_rw_texture_2d_float4_arr[];
RWTexture2D<float4> get_rw_texture_2d_float4(uint descriptor_id) {
return global_rw_texture_2d_float4_arr[descriptor_id];
}
struct GPUScene
{
float3 sky_color;
float3 cube_color;
float4x4 projection_inverse;
float4x4 view_inverse;
};
struct PushConstant
{
uint64_t scene_gpu_address;
uint scene_descriptor_id;
uint output_texture_descriptor_id;
};
[[vk::push_constant]]
PushConstant push_constant;
[numthreads(WORK_GROUP_SIZE_X, WORK_GROUP_SIZE_Y, 1)]
void cs_main(uint3 dispatch_thread_id : SV_DispatchThreadID)
{
GPUScene scene2 = global_buffer_arr[push_constant.scene_descriptor_id].Load<GPUScene>(0);
GPUScene scene = vk::RawBufferLoad<GPUScene>(push_constant.scene_gpu_address, 4);
float3 output_color = scene2.cube_color + scene.cube_color;
RWTexture2D<float4> output_texture = get_rw_texture_2d_float4(push_constant.output_texture_descriptor_id);
output_texture[uint2(0, 0)] = float4(output_color, 1.0f);
}
Resulting SPIR-V
; SPIR-V
; Version: 1.6
; Generator: Google spiregg; 0
; Bound: 240
; Schema: 0
OpCapability Shader
OpCapability RuntimeDescriptorArray
OpCapability Int64
OpCapability PhysicalStorageBufferAddresses
OpExtension "SPV_EXT_descriptor_indexing"
OpExtension "SPV_KHR_physical_storage_buffer"
OpMemoryModel PhysicalStorageBuffer64 GLSL450
OpEntryPoint GLCompute %cs_main "cs_main" %gl_GlobalInvocationID %global_buffer_arr %global_rw_texture_2d_float4_arr %push_constant
OpExecutionMode %cs_main LocalSize 16 16 1
OpSource HLSL 650
OpName %type_ByteAddressBuffer "type.ByteAddressBuffer"
OpName %global_buffer_arr "global_buffer_arr"
OpName %type_2d_image "type.2d.image"
OpName %global_rw_texture_2d_float4_arr "global_rw_texture_2d_float4_arr"
OpName %type_PushConstant_PushConstant "type.PushConstant.PushConstant"
OpMemberName %type_PushConstant_PushConstant 0 "scene_gpu_address"
OpMemberName %type_PushConstant_PushConstant 1 "scene_descriptor_id"
OpMemberName %type_PushConstant_PushConstant 2 "output_texture_descriptor_id"
OpName %push_constant "push_constant"
OpName %cs_main "cs_main"
OpName %param_var_dispatch_thread_id "param.var.dispatch_thread_id"
OpName %GPUScene "GPUScene"
OpMemberName %GPUScene 0 "sky_color"
OpMemberName %GPUScene 1 "cube_color"
OpMemberName %GPUScene 2 "projection_inverse"
OpMemberName %GPUScene 3 "view_inverse"
OpDecorate %gl_GlobalInvocationID BuiltIn GlobalInvocationId
OpDecorate %global_buffer_arr DescriptorSet 0
OpDecorate %global_buffer_arr Binding 0
OpDecorate %global_rw_texture_2d_float4_arr DescriptorSet 3
OpDecorate %global_rw_texture_2d_float4_arr Binding 0
OpDecorate %_runtimearr_uint ArrayStride 4
OpMemberDecorate %type_ByteAddressBuffer 0 Offset 0
OpMemberDecorate %type_ByteAddressBuffer 0 NonWritable
OpDecorate %type_ByteAddressBuffer Block
OpMemberDecorate %type_PushConstant_PushConstant 0 Offset 0
OpMemberDecorate %type_PushConstant_PushConstant 1 Offset 8
OpMemberDecorate %type_PushConstant_PushConstant 2 Offset 12
OpDecorate %type_PushConstant_PushConstant Block
%int = OpTypeInt 32 1
%int_1 = OpConstant %int 1
%uint = OpTypeInt 32 0
%uint_0 = OpConstant %uint 0
%uint_2 = OpConstant %uint 2
%uint_1 = OpConstant %uint 1
%uint_3 = OpConstant %uint 3
%uint_6 = OpConstant %uint 6
%uint_22 = OpConstant %uint 22
%uint_38 = OpConstant %uint 38
%int_0 = OpConstant %int 0
%int_2 = OpConstant %int 2
%float = OpTypeFloat 32
%float_1 = OpConstant %float 1
%v2uint = OpTypeVector %uint 2
%27 = OpConstantComposite %v2uint %uint_0 %uint_0
%_runtimearr_uint = OpTypeRuntimeArray %uint
%type_ByteAddressBuffer = OpTypeStruct %_runtimearr_uint
%_runtimearr_type_ByteAddressBuffer = OpTypeRuntimeArray %type_ByteAddressBuffer
%_ptr_StorageBuffer__runtimearr_type_ByteAddressBuffer = OpTypePointer StorageBuffer %_runtimearr_type_ByteAddressBuffer
%type_2d_image = OpTypeImage %float 2D 2 0 0 2 Rgba32f
%_runtimearr_type_2d_image = OpTypeRuntimeArray %type_2d_image
%_ptr_UniformConstant__runtimearr_type_2d_image = OpTypePointer UniformConstant %_runtimearr_type_2d_image
%ulong = OpTypeInt 64 0
%type_PushConstant_PushConstant = OpTypeStruct %ulong %uint %uint
%_ptr_PushConstant_type_PushConstant_PushConstant = OpTypePointer PushConstant %type_PushConstant_PushConstant
%v3uint = OpTypeVector %uint 3
%_ptr_Input_v3uint = OpTypePointer Input %v3uint
%void = OpTypeVoid
%37 = OpTypeFunction %void
%_ptr_Function_v3uint = OpTypePointer Function %v3uint
%39 = OpTypeFunction %void %_ptr_Function_v3uint
%v3float = OpTypeVector %float 3
%v4float = OpTypeVector %float 4
%mat4v4float = OpTypeMatrix %v4float 4
%GPUScene = OpTypeStruct %v3float %v3float %mat4v4float %mat4v4float
%_ptr_Function_GPUScene = OpTypePointer Function %GPUScene
%_ptr_Function_v3float = OpTypePointer Function %v3float
%_ptr_Function_type_2d_image = OpTypePointer Function %type_2d_image
%_ptr_Function_uint = OpTypePointer Function %uint
%_ptr_PushConstant_uint = OpTypePointer PushConstant %uint
%_ptr_StorageBuffer_type_ByteAddressBuffer = OpTypePointer StorageBuffer %type_ByteAddressBuffer
%_ptr_StorageBuffer_uint = OpTypePointer StorageBuffer %uint
%_ptr_PushConstant_ulong = OpTypePointer PushConstant %ulong
%_ptr_PhysicalStorageBuffer_GPUScene = OpTypePointer PhysicalStorageBuffer %GPUScene
%52 = OpTypeFunction %type_2d_image %_ptr_Function_uint
%_ptr_UniformConstant_type_2d_image = OpTypePointer UniformConstant %type_2d_image
%global_buffer_arr = OpVariable %_ptr_StorageBuffer__runtimearr_type_ByteAddressBuffer StorageBuffer
%global_rw_texture_2d_float4_arr = OpVariable %_ptr_UniformConstant__runtimearr_type_2d_image UniformConstant
%push_constant = OpVariable %_ptr_PushConstant_type_PushConstant_PushConstant PushConstant
%gl_GlobalInvocationID = OpVariable %_ptr_Input_v3uint Input
%54 = OpUndef %v3float
%55 = OpUndef %mat4v4float
%uint_4 = OpConstant %uint 4
%uint_5 = OpConstant %uint 5
%uint_7 = OpConstant %uint 7
%uint_8 = OpConstant %uint 8
%uint_9 = OpConstant %uint 9
%uint_10 = OpConstant %uint 10
%uint_11 = OpConstant %uint 11
%uint_12 = OpConstant %uint 12
%uint_13 = OpConstant %uint 13
%uint_14 = OpConstant %uint 14
%uint_15 = OpConstant %uint 15
%uint_16 = OpConstant %uint 16
%uint_17 = OpConstant %uint 17
%uint_18 = OpConstant %uint 18
%uint_19 = OpConstant %uint 19
%uint_20 = OpConstant %uint 20
%uint_21 = OpConstant %uint 21
%uint_23 = OpConstant %uint 23
%uint_24 = OpConstant %uint 24
%uint_25 = OpConstant %uint 25
%uint_26 = OpConstant %uint 26
%uint_27 = OpConstant %uint 27
%uint_28 = OpConstant %uint 28
%uint_29 = OpConstant %uint 29
%uint_30 = OpConstant %uint 30
%uint_31 = OpConstant %uint 31
%uint_32 = OpConstant %uint 32
%uint_33 = OpConstant %uint 33
%uint_34 = OpConstant %uint 34
%uint_35 = OpConstant %uint 35
%uint_36 = OpConstant %uint 36
%uint_37 = OpConstant %uint 37
%cs_main = OpFunction %void None %37
%88 = OpLabel
%89 = OpVariable %_ptr_Function_v3float Function
%90 = OpVariable %_ptr_Function_v3float Function
%91 = OpVariable %_ptr_Function_type_2d_image Function
%92 = OpVariable %_ptr_Function_v3float Function
%93 = OpVariable %_ptr_Function_type_2d_image Function
%94 = OpVariable %_ptr_Function_uint Function
%param_var_dispatch_thread_id = OpVariable %_ptr_Function_v3uint Function
%95 = OpLoad %v3uint %gl_GlobalInvocationID
%96 = OpAccessChain %_ptr_PushConstant_uint %push_constant %int_1
%97 = OpLoad %uint %96
%98 = OpAccessChain %_ptr_StorageBuffer_type_ByteAddressBuffer %global_buffer_arr %97
%99 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_0
%100 = OpLoad %uint %99
%101 = OpBitcast %float %100
%102 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_1
%103 = OpLoad %uint %102
%104 = OpBitcast %float %103
%105 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_2
%106 = OpLoad %uint %105
%107 = OpBitcast %float %106
%108 = OpCompositeConstruct %v3float %101 %104 %107
%109 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_3
%110 = OpLoad %uint %109
%111 = OpBitcast %float %110
%112 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_4
%113 = OpLoad %uint %112
%114 = OpBitcast %float %113
%115 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_5
%116 = OpLoad %uint %115
%117 = OpBitcast %float %116
%118 = OpCompositeConstruct %v3float %111 %114 %117
%119 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_6
%120 = OpLoad %uint %119
%121 = OpBitcast %float %120
%122 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_7
%123 = OpLoad %uint %122
%124 = OpBitcast %float %123
%125 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_8
%126 = OpLoad %uint %125
%127 = OpBitcast %float %126
%128 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_9
%129 = OpLoad %uint %128
%130 = OpBitcast %float %129
%131 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_10
%132 = OpLoad %uint %131
%133 = OpBitcast %float %132
%134 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_11
%135 = OpLoad %uint %134
%136 = OpBitcast %float %135
%137 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_12
%138 = OpLoad %uint %137
%139 = OpBitcast %float %138
%140 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_13
%141 = OpLoad %uint %140
%142 = OpBitcast %float %141
%143 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_14
%144 = OpLoad %uint %143
%145 = OpBitcast %float %144
%146 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_15
%147 = OpLoad %uint %146
%148 = OpBitcast %float %147
%149 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_16
%150 = OpLoad %uint %149
%151 = OpBitcast %float %150
%152 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_17
%153 = OpLoad %uint %152
%154 = OpBitcast %float %153
%155 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_18
%156 = OpLoad %uint %155
%157 = OpBitcast %float %156
%158 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_19
%159 = OpLoad %uint %158
%160 = OpBitcast %float %159
%161 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_20
%162 = OpLoad %uint %161
%163 = OpBitcast %float %162
%164 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_21
%165 = OpLoad %uint %164
%166 = OpBitcast %float %165
%167 = OpCompositeConstruct %v4float %121 %133 %145 %157
%168 = OpCompositeConstruct %v4float %124 %136 %148 %160
%169 = OpCompositeConstruct %v4float %127 %139 %151 %163
%170 = OpCompositeConstruct %v4float %130 %142 %154 %166
%171 = OpCompositeConstruct %mat4v4float %167 %168 %169 %170
%172 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_22
%173 = OpLoad %uint %172
%174 = OpBitcast %float %173
%175 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_23
%176 = OpLoad %uint %175
%177 = OpBitcast %float %176
%178 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_24
%179 = OpLoad %uint %178
%180 = OpBitcast %float %179
%181 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_25
%182 = OpLoad %uint %181
%183 = OpBitcast %float %182
%184 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_26
%185 = OpLoad %uint %184
%186 = OpBitcast %float %185
%187 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_27
%188 = OpLoad %uint %187
%189 = OpBitcast %float %188
%190 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_28
%191 = OpLoad %uint %190
%192 = OpBitcast %float %191
%193 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_29
%194 = OpLoad %uint %193
%195 = OpBitcast %float %194
%196 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_30
%197 = OpLoad %uint %196
%198 = OpBitcast %float %197
%199 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_31
%200 = OpLoad %uint %199
%201 = OpBitcast %float %200
%202 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_32
%203 = OpLoad %uint %202
%204 = OpBitcast %float %203
%205 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_33
%206 = OpLoad %uint %205
%207 = OpBitcast %float %206
%208 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_34
%209 = OpLoad %uint %208
%210 = OpBitcast %float %209
%211 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_35
%212 = OpLoad %uint %211
%213 = OpBitcast %float %212
%214 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_36
%215 = OpLoad %uint %214
%216 = OpBitcast %float %215
%217 = OpAccessChain %_ptr_StorageBuffer_uint %global_buffer_arr %97 %uint_0 %uint_37
%218 = OpLoad %uint %217
%219 = OpBitcast %float %218
%220 = OpCompositeConstruct %v4float %174 %186 %198 %210
%221 = OpCompositeConstruct %v4float %177 %189 %201 %213
%222 = OpCompositeConstruct %v4float %180 %192 %204 %216
%223 = OpCompositeConstruct %v4float %183 %195 %207 %219
%224 = OpCompositeConstruct %mat4v4float %220 %221 %222 %223
%225 = OpCompositeConstruct %GPUScene %108 %118 %171 %224
OpStore %90 %118
%226 = OpAccessChain %_ptr_PushConstant_ulong %push_constant %int_0
%227 = OpLoad %ulong %226
%228 = OpBitcast %_ptr_PhysicalStorageBuffer_GPUScene %227
%229 = OpLoad %GPUScene %228 Aligned 4
%230 = OpCompositeExtract %v3float %229 1
OpStore %89 %230
%231 = OpFAdd %v3float %118 %230
OpStore %92 %231
%232 = OpAccessChain %_ptr_PushConstant_uint %push_constant %int_2
%233 = OpLoad %uint %232
OpStore %94 %233
%234 = OpAccessChain %_ptr_UniformConstant_type_2d_image %global_rw_texture_2d_float4_arr %233
%235 = OpLoad %type_2d_image %234
OpStore %91 %235
OpStore %93 %235
%236 = OpCompositeExtract %float %231 0
%237 = OpCompositeExtract %float %231 1
%238 = OpCompositeExtract %float %231 2
%239 = OpCompositeConstruct %v4float %236 %237 %238 %float_1
OpImageWrite %235 %27 %239 None
OpReturn
OpFunctionEnd
Version: dxcompiler.dll: 1.7 - 1.7.2212.12 (8c9d92be7); dxil.dll: 1.7(101.7.2212.14) Command : ./dxc.exe -HV 2021 -spirv -fvk-use-dx-layout -fspv-target-env="vulkan1.3" -fspv-extension=SPV_EXT_descriptor_indexing -fspv-extension=SPV_KHR_physical_storage_buffer -E cs_main -T cs_6_5 .\test.hlsl -Fo .\test.spv -Fc .\test.ds %229 is scene and It does correctly extract the first element for %230(scene.cube_color) Will this method use the layout we indicate to compiler, for example if I request to use dxc_layout. Loading the member one by one by ourself will compile to similar code that use ByteAddressBuffer.Load, so I guess that's why it works.
Is there any reason, why it doesn't load the memory the same way ByteAddressBuffer.Load does?
Buffer Reference is in my opinion one of the top feature of GLSL/Vulkan, and this vk::RawBufferLoad
feature is really great. Unfortunately there is this bug with structures.
As an example, let's take this structure
struct TestBuffer
{
vec3 a;
vec3 b;
};
And fill TestBuffer[2] with the values as {{{1,2,3},{4,56}},{{7,8,9},{10,11,12}}}
In the shader, we can retrieve the values as such
TestBuffer test = vk::RawBufferLoad < TestBuffer > (pushConst.bufAddress);
printf("a: %.1f %.1f %.1f\n", test.a.x, test.a.y, test.a.z);
printf("b: %.1f %.1f %.1f\n", test.b.x, test.b.y, test.b.z);
The result show the correct result for the first member, but the following one contains the same data as a
.
a: 1.0 2.0 3.0
b: 1.0 2.0 3.0
Applying an offset will give again the right value for the first member, and copy to all other members.
TestBuffer test = vk::RawBufferLoad < TestBuffer > (pushConst.bufAddress + sizeof(TestBuffer));
...
a: 7.0 8.0 9.0
b: 7.0 8.0 9.0
Note: If the struct contains more members, they all have a copy of what was loaded for the first member. And if a following member is an integer, it is similar as doing asint()
on the value.
The workaround, is to extract each member separately.
float3 a = vk::RawBufferLoad < float3 > (pushConst.bufAddress + 0);
float3 b = vk::RawBufferLoad < float3 > (pushConst.bufAddress + sizeof(float3));
...
a: 1.0 2.0 3.0
b: 4.0 5.0 6.0
But this is not always simple, especially when the structure have members that can be #ifdef
out.
Extra, a very nice addition to vk::RawBufferLoad
would be to support the []
operand, of some other ways to load the n element of the buffer.
Its now broken in new and fun ways
https://godbolt.org/z/onYcs95bc
fatal error: generated SPIR-V is invalid: Structure id 7 decorated as Block for variable in PhysicalStorageBuffer storage class must follow relaxed storage buffer layout rules: member 0 is missing an Offset decoration
%LineStyle = OpTypeStruct %float %_arr_uint_uint_1
if doesn't matter if I annotate my struct members with [[vk::offset(x)]]
or [[vk::ext_decorate(35,x)]]
, I still get this
Here is a sample that shows the problem: https://godbolt.org/z/cbPnTaG45
Note that the "sizeof(float3)" is 12. This means that the offset for the individual loads are at 12 and 24. But the offsets in %GPUObjDesc_0
are 0, 16, and 32. So the values come from different memory locations.
From the description of the vk::RawBufferLoad
, the memory layout of the struct is not specificed.
This is really bad because it makes it impossible to know the memory layout so you can fill in the data on the CPU side.
The only action I think we need to take is to determine which layout is being used, and update the documentation. EDIT: I am concerned that changing the layout at this point will break existing code. Since it is not well documented, I cannot say the current behaviour is incorrect.
It seems to change depending on the layout option used:
default: 16 DX: 16, GL: 16 scalar: 12
This example shows that the byte address buffer uses different layouts than the rawbuffer loads: https://godbolt.org/z/nE4ns5WPe.
default: 16, DX: 12, GL: 16 scalar: 12
Its now broken in new and fun ways
https://godbolt.org/z/onYcs95bc
fatal error: generated SPIR-V is invalid: Structure id 7 decorated as Block for variable in PhysicalStorageBuffer storage class must follow relaxed storage buffer layout rules: member 0 is missing an Offset decoration %LineStyle = OpTypeStruct %float %_arr_uint_uint_1
if doesn't matter if I annotate my struct members with
[[vk::offset(x)]]
or[[vk::ext_decorate(35,x)]]
, I still get this
This seems to have been fixed, but comes back if I launder the address through a bitcast (possible with prop 0011) now https://godbolt.org/z/TnEM3Ka5d
P.S. my Suggestion is to only allow RawBuffer
and friends with -fvk-use-scalar-layout
As with other RawBuffer related issues, we will not be updating that feature. We will leave the layout as it for now.
However, we do need to carefully consider the layout for vk::BufferPointer. The current spec is too vague, and we will have to make it more detailed. The conversation in this issue should inform that conversation.
I opened https://github.com/microsoft/hlsl-specs/issues/306 to follow up on the vk::BufferPointer layout rules.
It also seems to have gotten fixed between 1.8.2405 and 2407