DirectXShaderCompiler
Amplification shader repro.: Invalid payload or 'TGSM pointers must originate from an unambiguous TGSM global variable'
I've modified the D3D12DynamicLOD project in the D3D12MeshShaders samples solution to issue a single DispatchMesh(1,1,1) command, which should launch exactly one thread group of the modified amplification shader below. That group should populate the payload with 4 instance structures and launch 4 downstream mesh shader thread groups. Each of the 4 mesh shader thread groups should then read the shared payload from its originating upstream A.S. and extract the instance structure assigned to that M.S. thread group. But the payload seems to be full of nonsense data.
The relevant additions/changes are -
---- d3d12dynamiclod.cpp ----
// for (uint32_t i = 0; i < dispatchCount; ++i)
// {
// uint32_t offset = dispatchCount * i;
// uint32_t count = min(m_instanceCount - offset, c_maxGroupDispatchCount);
//
// m_commandList->SetGraphicsRoot32BitConstant(1, offset, 0);
// m_commandList->SetGraphicsRoot32BitConstant(1, count, 1);
//
// m_commandList->DispatchMesh(count, 1, 1);
// }
m_commandList->DispatchMesh(1, 1, 1);
---- common.hlsl ----
struct TriInstance
{
float4 m_ndcPos;
};
struct SharedPayload
{
uint m_numInstances;
uint3 m_pad;
TriInstance m_instances[64];
};
---- MeshletAS.hlsl ----
groupshared SharedPayload g_sharedPayload;
[RootSignature(ROOT_SIG)]
[NumThreads(8, 8, 1)]
void main( in uint2 groupThreadID : SV_GroupThreadID )
{
g_sharedPayload.m_numInstances = 0;
GroupMemoryBarrierWithGroupSync();
if ( all(groupThreadID < (2u).xx) )
{
TriInstance newInst;
newInst.m_ndcPos = float4( -0.5.xx + (float2)groupThreadID.xy, 0.5, 1.0 );
uint myNewInstanceIdx;
InterlockedAdd( g_sharedPayload.m_numInstances, 1, myNewInstanceIdx );
g_sharedPayload.m_instances[myNewInstanceIdx] = newInst;
}
GroupMemoryBarrierWithGroupSync();
DispatchMesh( g_sharedPayload.m_numInstances, 1, 1, g_sharedPayload );
}
---- MeshletMS.hlsl ----
[RootSignature(ROOT_SIG)]
[NumThreads(64, 1, 1)]
[OutputTopology("triangle")]
void main(
in uint groupID : SV_GroupID,
in payload SharedPayload payloadIn,
out vertices PosOnlyVtx verts[32],
out indices uint3 tris[32])
{
SetMeshOutputCounts(3/*totalVertCount*/, 1/*totalPrimCount*/);
float4 instanceNDCRootPos = payloadIn.m_instances[groupID].m_ndcPos;
verts[0].m_pos = instanceNDCRootPos;
verts[1].m_pos = instanceNDCRootPos + float4(0.0, 0.2, 0.0, 0.0);
verts[2].m_pos = instanceNDCRootPos + float4(0.1, 0.0, 0.0, 0.0);
tris[0] = uint3(0,1,2);
}
---- MeshletPS.hlsl ----
[RootSignature(ROOT_SIG)]
float4 main(in PosOnlyVtx pin) : SV_TARGET
{
return float4( 0.5, 0.1, 1.0, 1.0 );
}
Ignoring the pointlessness of the example, if I also add in a RWStructuredBuffer<float> DebugVals into which I have the first/[0,0] thread of the A.S. group write all 4 final payload instance elements (after the final GroupMemoryBarrierWithGroupSync()), i.e. -
if ( all(groupThreadID == (0u).xx) )
{
DebugVals[0] = g_sharedPayload.m_instances[0].m_ndcPos.x;
DebugVals[1] = g_sharedPayload.m_instances[0].m_ndcPos.y;
... etc
I can see in PIX that they're all nonsense values. However, if I write plain old immediate values instead -
...
DebugVals[4] = 123.0;
DebugVals[5] = 456.0;
DebugVals[6] = 789.0;
....
Then they're all present and correct.
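For reference, here is a minimal CPU-side sketch of what the A.S. group should leave in the payload, assuming the shader logic above behaves as written. It is not part of the sample: `Float4`, `EmulatedInterlockedAdd` and `SimulateAmplificationGroup` are hypothetical names introduced only for illustration, and the loop visits threads one at a time, so the slot order is fixed here even though it is not on a GPU.

```cpp
#include <cassert>
#include <cstdint>

// CPU-side mirrors of the HLSL structs in common.hlsl (illustrative only).
struct Float4 { float x, y, z, w; };
struct TriInstance { Float4 m_ndcPos; };
struct SharedPayload
{
    uint32_t    m_numInstances;
    uint32_t    m_pad[3];
    TriInstance m_instances[64];
};

// uint + uint3 + 64 * float4 = 4 + 12 + 1024 bytes.
static_assert(sizeof(SharedPayload) == 1040, "unexpected payload size");

// Hypothetical stand-in for HLSL InterlockedAdd: adds `value` to `dest` and
// returns the value `dest` held before the add. The simulation is sequential,
// so no real atomics are needed.
static uint32_t EmulatedInterlockedAdd(uint32_t& dest, uint32_t value)
{
    const uint32_t original = dest;
    dest += value;
    return original;
}

// Walks the 8x8 thread group one thread at a time, applying the same logic as
// main() in MeshletAS.hlsl. On a GPU the slot each thread receives is not
// deterministic; only the set of stored values is.
SharedPayload SimulateAmplificationGroup()
{
    SharedPayload payload = {};
    for (uint32_t y = 0; y < 8; ++y)
    {
        for (uint32_t x = 0; x < 8; ++x)
        {
            if (x < 2 && y < 2) // all(groupThreadID < (2u).xx)
            {
                TriInstance newInst;
                newInst.m_ndcPos = { -0.5f + float(x), -0.5f + float(y), 0.5f, 1.0f };
                const uint32_t idx = EmulatedInterlockedAdd(payload.m_numInstances, 1);
                assert(idx < 64);
                payload.m_instances[idx] = newInst;
            }
        }
    }
    return payload;
}
```

Under these assumptions m_numInstances ends up as 4 and the four m_ndcPos values are the combinations of x, y in {-0.5, 0.5} with z = 0.5 and w = 1.0, which is what the DebugVals readback would be expected to show (in some order).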
So is there a problem with the filling in of the elements of the groupshared payload? I.e. -
g_sharedPayload.m_instances[myNewInstanceIdx] = newInst;
Well, changing just that line to something like this -
g_sharedPayload.m_instances[myNewInstanceIdx].m_ndcPos = 1234.0.xxxx;
now gives a slightly cryptic compiler error -
> dxc.exe /nologo /Emain /Fo bin\x64\Debug\MeshletAS.cso /Od /Zi /Tas_6_5 -Qembed_debug /Fd bin\x64\Debug\MeshletAS.pdb MeshletAS.hlsl
error: validation errors
MeshletAS.hlsl:35:56: error: TGSM pointers must originate from an unambiguous TGSM global variable.
note: at '%22 = getelementptr inbounds [4 x float], [4 x float] addrspace(3)* %21, i32 0, i32 0' in block '#1' of function 'main'.
Validation failed.
which happens with both -
C:\Program Files (x86)\Windows Kits\10\bin\10.0.20348.0\x64\dxc.exe -
Version: dxcompiler.dll: 1.6 - 1.5.0.2748 (2cad836b2); dxil.dll: 1.6(101.5.2005.60)
and with the 2021-12-08 GitHub dxc -
Version: dxcompiler.dll: 1.6 - 1.6.2112.12 (770ac0cc1); dxil.dll: 1.6(101.6.2112.2)
So, along with this issue, I do wonder whether I've stumbled across a few possibly related compiler issues in this AS/MS area. The code is now at the point where it's getting difficult to work around these issues by restructuring how the AS threads populate the payload array elements in any simpler way.
Cheers
Dan
I have a PR #4452 which fixes related issue #4421. That may fix this issue as well, but I haven't had a chance to construct/try this repro.
Artifacts for the PR build should show up here once the AppVeyor build is done.
If you have a chance to try with dxcompiler.dll from these artifacts and find that this issue no longer repros, please let us know!
Thanks! -Tex
Thanks @tex3d. I've just grabbed the latest build artifact referenced by AppVeyor in the PR you mention and can confirm that -
dxc.exe /nologo /Emain /Fo bin\x64\Debug\MeshletAS.cso /Od /Zi /Tas_6_5 -Qembed_debug /Fd bin\x64\Debug\MeshletAS.pdb MeshletAS.hlsl
appears to succeed (no warnings/errors spat out any more). However, I can provoke a new error, very similar to the original repro, with just a very simple change to the above repro code.
In 'main' of MeshletAS.hlsl (with all the changes shown above), replacing the line -
g_sharedPayload.m_instances[myNewInstanceIdx] = newInst;
with -
g_sharedPayload.m_instances[myNewInstanceIdx].m_ndcPos = 1234.0.xxxx;
dxc now produces -
error: validation errors
MeshletAS.hlsl:35:56: error: TGSM pointers must originate from an unambiguous TGSM global variable.
note: at '%18 = getelementptr inbounds [4 x float], [4 x float] addrspace(3)* %17, i32 0, i32 0' in block '#1' of function 'main'.
Validation failed.
I don't know how this kind of stuff is usually handled but since it's been a while without any kind of acknowledgement of this as an outstanding issue, I worry there's a slight possibility it'll be forgotten. Any chance of at least an acknowledgement of this as an ongoing issue that'll be fixed eventually?
This is currently on our list of things we'll try and get to.
As reported, the 2207 release fails with the shader described in this comment: https://godbolt.org/z/4ffG4Ks1G
However the latest release compiles correctly: https://godbolt.org/z/W3c8s7dzY
Closing as resolved