naga
naga copied to clipboard
`make validate-spv` fails with newest SPIRV-Tools
Using spirv-val
from khronosGroup/SPIRV-Tools@a98f05d0, Naga's make validate-spv
target fails:
$ make validate-spv
Validating spv/access.spvasm
error: line 145: Structure id 12 decorated as Block for variable in Uniform storage class must follow standard uniform buffer layout rules: member 0 is a matrix with stride 8 not satisfying alignment to 16
%Baz = OpTypeStruct %mat3v2float
make: *** [Makefile:34: validate-spv] Error 1
This is a new change; spirv-val
from, say, sdk-1.3.204.1
has no complaints.
See also #2032.
Okay, so the WGSL declarations in access.wgsl
are:
struct Baz {
m: mat3x2<f32>,
}
@group(0) @binding(1)
var<uniform> baz: Baz;
This results in the SPIR-V:
OpMemberName %Baz 0 "m"
OpName %Baz "Baz"
OpName %baz "baz"
...
OpMemberDecorate %Baz 0 Offset 0
OpMemberDecorate %Baz 0 ColMajor
OpMemberDecorate %Baz 0 MatrixStride 8
...
OpDecorate %baz DescriptorSet 0
OpDecorate %baz Binding 1
...
%float = OpTypeFloat 32
%v2float = OpTypeVector %float 2
%mat3v2float = OpTypeMatrix %v2float 3
%Baz = OpTypeStruct %mat3v2float
WGSL's rules say that the RequiredAlignOf(mat3x2<f32>, Uniform)
is AlignOf(mat3x2<f32>)
, which we're satisfying.
Vulkan's shader resource interface offset and stride assignment rules say,
If the
uniformBufferStandardLayout
feature is not enabled on the device, then any member of anOpTypeStruct
with a storage class ofUniform
and a decoration ofBlock
must be aligned according to its extended alignment.
where the "extended alignment" of a matrix member of a struct is
equal to the largest extended alignment of any of its members, rounded up to a multiple of 16.
So it seems like WGSL's requirements are less stringent than Vulkan's.
In Matrix chat, Alan Baker says:
intentionally matrices are tightly packed in WGSL so some matrices in uniform buffers will require an alternative representation
so our SPIR-V generation will need to explode that mat3x2
I was hoping we don't have to do this for the SPIR-V backend as well since you could specify a lower stride and validation didn't complain, but it seems we have to...
It's unfortunate that we now have to do this complex translation in 3 backends (GLSL, SPIR-V, and HLSL) we might want to somehow consolidate this.
So I guess this is what we need to generate now:
OpMemberDecorate %Baz 0 Offset 0
OpMemberDecorate %Baz 1 Offset 8
OpMemberDecorate %Baz 2 Offset 16
%Baz = OpTypeStruct %vec2_f32 %vec2_f32 %vec2_f32
and then loads look like this:
%mat_col0_ptr = OpAccessChain %ptr_vec2_f32 %baz %const_0 %const_0
%mat_col1_ptr = OpAccessChain %ptr_vec2_f32 %baz %const_0 %const_1
%mat_col2_ptr = OpAccessChain %ptr_vec2_f32 %baz %const_0 %const_2
%mat_col0 = OpLoad %vec2_f32 %mat_col0_ptr
%mat_col1 = OpLoad %vec2_f32 %mat_col1_ptr
%mat_col2 = OpLoad %vec2_f32 %mat_col2_ptr
%mat = OpCompositeConstruct %mat3x2_f32 %mat_col0 %mat_col1 %mat_col2
We can no longer assume that Naga IR struct member indices are the same as SPIR-V struct member indices.
Yeah, although there are a bunch of edge-cases (as we've seen in the HLSL PRs that were related to matCx2's). IIRC main one being that wrapping vec2's in a struct might not work in cases where the original matCx2 was already part of a struct since the resulting struct wrapper will have a min alignment requirement of 16 (in uniforms/extended layout) but matCx2's should have an alignment of 8.
It might be helpful to track the HLSL implementation of this while implementing it for the SPIR-V backend.
wrapping vec2's in a struct might not work
Right - looking at the table in the spec, it seems like putting the vec2's in any kind of container causes them to acquire the 16-byte alignment requirement. So the only way to make it work is to just inline them.
One problem with the instruction sequence I proposed earlier is that SPIR-V has no instructions for using a dynamic index into a matrix by value. If you have a pointer to the matrix, then OpAccessChain
works, but otherwise, there's no way to do it. So the actual sequence of instructions would need to follow the OpCompositeConstruct
with a spill to a temporary variable, and then use OpAccessChain
as usual on that.
That's pretty gross.
That's pretty gross.
Confirmed with Tint team that this is what Tint does, and no better solution is known.