ComputeSharp icon indicating copy to clipboard operation
ComputeSharp copied to clipboard

ThreadIds.Normalized doesn't give the full 0-1 range

Open morphinapg opened this issue 2 years ago • 3 comments

Description (optional)

It appears ThreadIds.Normalized is calculated as Index / Count, rather than Index / (Count-1). So the maximum index never gives a ThreadIds.Normalized of 1.

Reproduction Steps

I'm working on a shader that will help me process color information for a 3D LUT, so I'm using a 32x32x32 float4 texture, and when I save the ThreadIds.Normalized value of X, Y, and Z for each item in the texture, the maximum value I ever see is 0.96875.

As we can see,

0.96875 * 32 = 31

So it's clear that ThreadIds.Normalized is taking the index (31) and dividing it by the size of the axis (32).

However, this means the values aren't fully normalized, since the maximum value does not equal 1.0.

Expected Behavior

The maximum value for ThreadIds.Normalized should be 1

Actual Behavior

The maximum value is less than 1. The longer the axis is, the closer it will get to 1, but it will never reach it, except maybe through rounding if the axis is very large.

System info

This section should contain useful info such as:

  • ComputeSharp 2.0.0-alpha.23
  • Windows 11
  • Ryzen 7 2700x / RTX 3080 Ti
  • Visual Studio 2022 17.2.0 / .NET 6

Additional context (optional)

I was able to get around this by using

float
                    R = (float)ThreadIds.X / (DispatchSize.X - 1),
                    G = (float)ThreadIds.Y / (DispatchSize.Y - 1),
                    B = (float)ThreadIds.Z / (DispatchSize.Z - 1);

morphinapg avatar May 15 '22 13:05 morphinapg

Ah, you're right, good catch! The fix will be to just decrement that by one, yup. Thanks! 😄

Sergio0694 avatar May 15 '22 21:05 Sergio0694

This is actually being slightly trickier to fix given that I also have a whole lot of unit tests that were relying on the previous behavior, including some using images stored as assets into the various projects that I'll need to regenerate. Will take a bit 🥲

Sergio0694 avatar May 24 '22 20:05 Sergio0694

So, I've worked on this a little bit more now that other work is done, but I'm struggling to fix all tests. There's one that was failing (Transition_ReadWriteTexture3D) that's flat out just returning a blank image in the second layer of the 3D texture, and I'm not sure why. The generated HLSL is this, and the only difference is the -1 in this new version:

#define __GroupSize__get_X 4
#define __GroupSize__get_Y 4
#define __GroupSize__get_Z 4

cbuffer _ : register(b0)
{
    uint __x;
    uint __y;
    uint __z;
}

SamplerState __sampler : register(s);

Texture3D<float4> source : register(t0);

RWTexture3D<unorm float4> destination : register(u0);

[NumThreads(__GroupSize__get_X, __GroupSize__get_Y, __GroupSize__get_Z)]
void Execute(uint3 ThreadIds : SV_DispatchThreadID)
{
    if (ThreadIds.x < __x && ThreadIds.y < __y && ThreadIds.z < __z)
    {
        destination[ThreadIds.xyz] = source.SampleLevel(__sampler, float3(ThreadIds.x, ThreadIds.y, ThreadIds.z) / float3(__x - 1, __y - 1, __z - 1), 0);
    }
}

I'm not entirely sure why this causes the image to be messed up... 🤔

Actually, here z is 2. So I guess the dispatch values could be 0 and 1 before, resulting in 0 or 0.5, whereas now they'll be 0 and 1. Not entirely sure why 1 would cause this, unless that was because it's technically out of bounds given the range is exclusive.

Mmh...

Sergio0694 avatar Aug 05 '22 14:08 Sergio0694

Alright, sorry for the delay, this is now finally completed! 🎉 Thank you for the bug report @morphinapg!

Sergio0694 avatar Sep 27 '22 10:09 Sergio0694