DirectXShaderCompiler
DirectXShaderCompiler copied to clipboard
ExecutionTest::ComputeSampleTest has invalid verification logic causing failures with mesh shaders
The relevant function:
void VerifySampleResults(const UINT *pPixels, UINT width) {
UINT xlod = 0;
UINT ylod = 0;
// Each pixel contains 4 samples and 4 LOD calculations.
// 2 of these (called 'left' and 'right') have X values that vary and a constant Y
// 2 others (called 'top' and 'bot') have Y values that vary and a constant X
// Only of the X variant sample results and one of the Y variant results
// are actually reported for the pixel.
// The other 2 serve as "helpers" to the other pixels in the quad.
// On the left side of the quad, the 'left' samples are reported.
// Op the top of the quad, the 'top' samples are reported and so on.
// The varying coordinate values alternate between zero and a
// value whose magnitude increases with the index.
// As a result, the LOD level should steadily increas.
// Due to vagaries of implementation, the same derivatives
// in both directions might result in different levels for different locations
// in the quad. So only comparisons between sample results and LOD calculations
// and ensuring that the LOD increased and reaches the max can be tested reliably.
for (unsigned i = 0; i < width; i++) {
// CalculateLOD and Sample from texture with mip levels containing LOD index should match
VERIFY_ARE_EQUAL(pPixels[4*i + 0], pPixels[4*i + 1]);
VERIFY_ARE_EQUAL(pPixels[4*i + 2], pPixels[4*i + 3]);
// Make sure LODs are ever climbing as magnitudes increase
VERIFY_IS_TRUE(pPixels[4*i] >= xlod);
xlod = pPixels[4*i];
VERIFY_IS_TRUE(pPixels[4*i + 2] >= ylod);
ylod = pPixels[4*i + 2];
}
// Make sure we reached the max lod level for both tracks
VERIFY_ARE_EQUAL(xlod, 6u);
VERIFY_ARE_EQUAL(ylod, 6u);
}
This iterates "pixel" by pixel ("pixel" because it's a 1D dispatch, which is laid out as if it were operating on quads) and verifies the 4 components, as:
- The LOD that would be computed based on the X derivative between that pixel and 0.
- The actual LOD that a sample operation used with the same coordinates of 1, by loading from a texture with different values in each mip.
- Same as 1 but for Y dimension
- Same as 2 but for Y dimension
Since it's iterating pixel by pixel, this loop will go from top-left of a quad, to top-right, to bottom-left, to bottom-right, and then back to top-left of a new quad. So when going from pixel 0 to pixel 1, it's expected that the X-derivative-LOD should not decrease, since pixel 1's X coordinate increased, and therefore that pixel's derivative vs 0 increased. But going from pixel 1 to 2... it is absolutely valid for the X-derivative-LOD to decrease, because pixel 2's X coordinate is less than pixel 1's.
This logic is incorrect for both compute and mesh, but for compute, it coincidentally works, because pixel 1's X-derivative-LOD is also 0, because the 1/336 difference between 0 and pixel 1's X coordinate is not sufficient to change LODs. For mesh, however, since the limit of a thread group size is 128, the validation is done using the smaller range of 116, and 1/116 is sufficient to change LOD for pixel 1.
This validation should instead be iterating quad by quad, and ensuring that when not moving right->left that the X-derivative-LOD doesn't decrease (i.e. ignoring the edge of 1->2 for pixel indices mod 4).
For Y-derivative-LODs, since the logic doesn't strictly iterate row-by-row by quads, but instead along a diagonal, there's no situation where increasing thread index will cause a decrease in the Y coordinate, so there's no real need to change the validation for Y values.