DirectXShaderCompiler icon indicating copy to clipboard operation
DirectXShaderCompiler copied to clipboard

ExecutionTest::ComputeSampleTest has invalid verification logic causing failures with mesh shaders

Open jenatali opened this issue 2 years ago • 0 comments

The relevant function:

void VerifySampleResults(const UINT *pPixels, UINT width) {
  UINT xlod = 0;
  UINT ylod = 0;
  // Each pixel contains 4 samples and 4 LOD calculations.
  // 2 of these (called 'left' and 'right') have X values that vary and a constant Y
  // 2 others (called 'top' and 'bot') have Y values that vary and a constant X
  // Only of the X variant sample results and one of the Y variant results
  // are actually reported for the pixel.
  // The other 2 serve as "helpers" to the other pixels in the quad.
  // On the left side of the quad, the 'left' samples are reported.
  // Op the top of the quad, the 'top' samples are reported and so on.
  // The varying coordinate values alternate between zero and a
  // value whose magnitude increases with the index.
  // As a result, the LOD level should steadily increas.
  // Due to vagaries of implementation, the same derivatives
  // in both directions might result in different levels for different locations
  // in the quad. So only comparisons between sample results and LOD calculations
  // and ensuring that the LOD increased and reaches the max can be tested reliably.
  for (unsigned i = 0; i < width; i++) {
    // CalculateLOD and Sample from texture with mip levels containing LOD index should match
    VERIFY_ARE_EQUAL(pPixels[4*i + 0], pPixels[4*i + 1]);
    VERIFY_ARE_EQUAL(pPixels[4*i + 2], pPixels[4*i + 3]);
    // Make sure LODs are ever climbing as magnitudes increase
    VERIFY_IS_TRUE(pPixels[4*i] >= xlod);
    xlod = pPixels[4*i];
    VERIFY_IS_TRUE(pPixels[4*i + 2] >= ylod);
    ylod = pPixels[4*i + 2];
  }
  // Make sure we reached the max lod level for both tracks
  VERIFY_ARE_EQUAL(xlod, 6u);
  VERIFY_ARE_EQUAL(ylod, 6u);
}

This iterates "pixel" by pixel ("pixel" because it's a 1D dispatch, which is laid out as if it were operating on quads) and verifies the 4 components, as:

  1. The LOD that would be computed based on the X derivative between that pixel and 0.
  2. The actual LOD that a sample operation used with the same coordinates of 1, by loading from a texture with different values in each mip.
  3. Same as 1 but for Y dimension
  4. Same as 2 but for Y dimension

Since it's iterating pixel by pixel, this loop will go from top-left of a quad, to top-right, to bottom-left, to bottom-right, and then back to top-left of a new quad. So when going from pixel 0 to pixel 1, it's expected that the X-derivative-LOD should not decrease, since pixel 1's X coordinate increased, and therefore that pixel's derivative vs 0 increased. But going from pixel 1 to 2... it is absolutely valid for the X-derivative-LOD to decrease, because pixel 2's X coordinate is less than pixel 1's.

This logic is incorrect for both compute and mesh, but for compute, it coincidentally works, because pixel 1's X-derivative-LOD is also 0, because the 1/336 difference between 0 and pixel 1's X coordinate is not sufficient to change LODs. For mesh, however, since the limit of a thread group size is 128, the validation is done using the smaller range of 116, and 1/116 is sufficient to change LOD for pixel 1.

This validation should instead be iterating quad by quad, and ensuring that when not moving right->left that the X-derivative-LOD doesn't decrease (i.e. ignoring the edge of 1->2 for pixel indices mod 4).

For Y-derivative-LODs, since the logic doesn't strictly iterate row-by-row by quads, but instead along a diagonal, there's no situation where increasing thread index will cause a decrease in the Y coordinate, so there's no real need to change the validation for Y values.

jenatali avatar Nov 14 '22 22:11 jenatali