renderdoc
Allow inspecting groupshared memory in D3D11 shader debugging
Description
I don't know whether this is already implemented and I just don't know how to use it, or whether it's a genuine bug, so this is more of a question than a bug report.
The first issue is seeing values such as vThreadID or vThreadGroupID in the compute shader debugger. I think other semantics aren't easily visible either (e.g. the instance ID), although they are easier to deduce from context. Hovering over the name or seeing it in Variables (or both) would be very useful.
The other issue is also related to data visibility: seeing groupshared memory as a compute shader runs. I understand that the contents of memory from the full run may not be inspectable (as in, it wouldn't show things like race conditions), but at least the value that the current thread is writing would be quite useful. Perhaps it could appear in the Variables window like the registers, or in Constants & Resources.
One other related question is how to use the Watch window. Is there a way to add anything other than registers? For instance, adding g0[10] for the groupshared memory to observe how that particular entry evolves?
Environment
- RenderDoc build: 1.0
- Operating System: Windows 7
- API: D3D11
About vThreadID and friends, those weren't omitted for any good reason. I had skipped them since I'd thought of them as constants - e.g. in compute shaders you choose which group and thread to debug. I realise that's not very useful though, so I think the recent revamp to allow HLSL debugging should now display them properly, although I think I might still have a bug or two in that area.
With groupshared memory it's more tricky. To clarify: the compute debugging in RenderDoc is currently limited to running a single thread at once, so groupshared memory is simulated internally but it's kind of trivial because there aren't any reads/writes from other threads. In most cases that means debugging shaders that use groupshared will rapidly turn into nonsense. You can use the 'run to load/sample/gather' button which will also pick up loads from groupshared, but it'll only have the correct value if no other thread was expected to write to it.
Displaying the contents in the UI is the challenge. The debugging works by running the full simulation and snapshotting the state at each step. That's not too bad if you're only talking about a dozen or so registers, but if every snapshot also includes up to 32 KB of groupshared then the memory cost gets dramatically worse.
Changing to a more iterative approach where simulation only happens as needed would solve that, but then you have performance issues from re-running the same work over and over. If you cache to solve that, you start going back into the same problem. It also becomes harder to support stepping backwards, which I think is a very useful feature, especially for shaders.
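To put rough numbers on the snapshot cost described above, here is a back-of-the-envelope sketch. The step count and register count used in the usage note are made-up illustrative values, and `FullSnapshotBytes` is a hypothetical helper, not anything from RenderDoc. The only figure taken from the thread is the 32 KB groupshared ceiling.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Assumed sizes for illustration only; not measured from RenderDoc.
constexpr std::uint64_t kRegisterBytes = 16;            // one 4 x 32-bit register
constexpr std::uint64_t kGroupsharedBytes = 32 * 1024;  // the 32 KB ceiling mentioned above

// Total bytes needed if every debug step stores a full copy of the state.
inline std::uint64_t FullSnapshotBytes(std::uint64_t steps, std::uint64_t registers,
                                       bool includeGroupshared)
{
  std::uint64_t perStep = registers * kRegisterBytes;
  if(includeGroupshared)
    perStep += kGroupsharedBytes;

  // Guard against overflow in the multiplication below.
  assert(perStep == 0 || steps <= std::numeric_limits<std::uint64_t>::max() / perStep);
  return steps * perStep;
}
```

With a hypothetical 10,000 steps and 12 registers, `FullSnapshotBytes(10000, 12, false)` is about 1.9 MB, while `FullSnapshotBytes(10000, 12, true)` is about 330 MB, which is why full per-step snapshots stop being practical once groupshared is included.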
I think probably the best solution is to instead store each step as a delta (either a two-way delta or with regular snapshots to allow rewind) since each individual step doesn't change much. That's going to need a very different representation though. Since I'm hoping to look at Vulkan shader debugging later in the year, I'll need to refactor a lot of this code anyway to support both and it would make sense to re-examine this at that time. For the moment though I think groupshared is probably going to remain opaque.
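As a rough illustration of the two-way delta idea, here is a minimal sketch. All names (`WordChange`, `DeltaHistory` and its methods) are hypothetical and the state is simplified to a flat array of 32-bit words; it is not how RenderDoc stores debug state, just one way the approach could look.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// One changed 32-bit word: its location plus the value before and after the step.
struct WordChange
{
  std::size_t index;
  std::uint32_t before;
  std::uint32_t after;
};

// All changes made by a single simulated instruction.
using StepDelta = std::vector<WordChange>;

class DeltaHistory
{
public:
  explicit DeltaHistory(std::size_t words) : m_state(words, 0) {}

  // Append a step that performs the given (index, value) writes.
  // Only valid when the cursor is at the end of the recorded history.
  void RecordStep(const std::vector<std::pair<std::size_t, std::uint32_t>> &writes)
  {
    assert(m_cursor == m_steps.size());
    StepDelta delta;
    for(const auto &w : writes)
    {
      assert(w.first < m_state.size());
      delta.push_back({w.first, m_state[w.first], w.second});
      m_state[w.first] = w.second;
    }
    m_steps.push_back(std::move(delta));
    ++m_cursor;
  }

  // Undo the most recently applied step. Returns false at the start of history.
  bool StepBack()
  {
    if(m_cursor == 0)
      return false;
    const StepDelta &delta = m_steps[--m_cursor];
    // Undo in reverse order so repeated writes to one word restore correctly.
    for(auto it = delta.rbegin(); it != delta.rend(); ++it)
      m_state[it->index] = it->before;
    return true;
  }

  // Re-apply the next recorded step. Returns false at the end of history.
  bool StepForward()
  {
    if(m_cursor == m_steps.size())
      return false;
    for(const WordChange &c : m_steps[m_cursor++])
      m_state[c.index] = c.after;
    return true;
  }

  std::uint32_t Read(std::size_t index) const { return m_state.at(index); }

private:
  std::vector<std::uint32_t> m_state;  // the single live copy of the state
  std::vector<StepDelta> m_steps;      // per-step changes only
  std::size_t m_cursor = 0;            // number of steps currently applied
};
```

Memory now grows with the number of words each step actually touches rather than with the full state size, and rewinding costs one delta per step. The alternative mentioned above, one-way deltas plus periodic full snapshots, would trade cheaper deltas for having to replay forward from the nearest snapshot.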
I see you've addressed the thread id/group quite quickly so thanks for that!
Regarding groupshared, thanks for the explanation. I can better understand now how tricky it is to implement for the entire shader run in the current implementation. Hopefully when you get around to Vulkan debugging it can be piggybacked on that task.
Oh groupshared, how I long for thee. The doctor of rendering I beseech.
Even being able to edit registers and the groupshared memory manually to simulate some shader state would be extremely useful tbh, regardless of more complex solutions.
Editing registers and groupshared memory would be much more work on top of everything mentioned above and opens a lot of other UI pains: if you step forward, edit a value, then step back past where you edited and run again - does it use the edited value or the actual value? It's a sort of microcosm of the kind of semantic difficulties for state editing.
You can hack in groupshared contents if you know what they should be. In D3D11DebugManager::CreateShaderGlobalState each groupshared declaration creates a memory buffer that's initially empty:
https://github.com/baldurk/renderdoc/blob/c9c51b5a3eb16c8fe07234fe7feedc58b6153bc1/renderdoc/driver/d3d11/d3d11_shaderdebug.cpp#L600-L615
You could fill that with data at the start if you know what it should be for other threads maybe. Or else you can poke into it in the main loop of D3D11Replay::DebugThread if you know which step or instruction you want to have the results appear:
https://github.com/baldurk/renderdoc/blob/c9c51b5a3eb16c8fe07234fe7feedc58b6153bc1/renderdoc/driver/d3d11/d3d11_shaderdebug.cpp#L2354-L2374
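For anyone trying this hack, the fill itself amounts to copying known values into the buffer at the right byte offset. The sketch below assumes the groupshared storage can be treated as a flat byte buffer and that the values are tightly packed 32-bit floats; both are assumptions, and `PrefillGroupshared` and `ReadFloat` are hypothetical helpers rather than RenderDoc functions. Check the linked source for the actual type and layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Copy known-good values into a raw groupshared buffer starting at byteOffset.
// Returns false, and writes nothing, if the values would not fit.
inline bool PrefillGroupshared(std::vector<unsigned char> &buffer, std::size_t byteOffset,
                               const std::vector<float> &values)
{
  const std::size_t byteCount = values.size() * sizeof(float);
  if(byteOffset > buffer.size() || byteCount > buffer.size() - byteOffset)
    return false;
  if(byteCount > 0)
    std::memcpy(buffer.data() + byteOffset, values.data(), byteCount);
  return true;
}

// Read one float back out of the buffer, e.g. to check what a load would see.
inline float ReadFloat(const std::vector<unsigned char> &buffer, std::size_t byteOffset)
{
  assert(byteOffset <= buffer.size() && sizeof(float) <= buffer.size() - byteOffset);
  float value = 0.0f;
  std::memcpy(&value, buffer.data() + byteOffset, sizeof(float));
  return value;
}
```

Under those assumptions, an entry like g0[10] with a 4-byte stride would sit at byte offset 40, so `PrefillGroupshared(buffer, 40, {1.5f})` would stand in for the value another thread was expected to have written there.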
Oh, that is pretty sweet! By editing I meant directly modifying the one and only state of the shader simulator, so I wouldn't expect the shader to behave as expected if I started rewinding it etc. This could easily be indicated by marking the shader state as dirty as soon as you elect to edit, disabling the backwards button for example. It would be a super powerful and simple way to do more creative debugging, especially with groupshared stuff.
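The dirty-flag suggestion could look something like the following sketch. `EditableSession` and its methods are hypothetical names invented for illustration, and stepping is reduced to a counter; nothing here reflects RenderDoc's actual debugger classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical debug session holding the single live copy of groupshared memory.
class EditableSession
{
public:
  explicit EditableSession(std::vector<std::uint32_t> groupshared)
      : m_groupshared(std::move(groupshared))
  {
  }

  // Manual edit: overwrite one word and mark the recorded history as untrustworthy.
  void Poke(std::size_t index, std::uint32_t value)
  {
    assert(index < m_groupshared.size());
    m_groupshared[index] = value;
    m_dirty = true;
  }

  // The UI would grey out the "step backwards" button when this returns false.
  bool CanStepBack() const { return !m_dirty && m_step > 0; }

  // Stand-in for simulating one instruction.
  void StepForward() { ++m_step; }

  // Refuses to rewind once the state has been edited by hand.
  bool StepBack()
  {
    if(!CanStepBack())
      return false;
    --m_step;
    return true;
  }

  std::uint32_t Read(std::size_t index) const { return m_groupshared.at(index); }
  bool IsDirty() const { return m_dirty; }

private:
  std::vector<std::uint32_t> m_groupshared;
  std::size_t m_step = 0;
  bool m_dirty = false;
};
```

This sidesteps the question of which value a rewind should use by simply disallowing rewinds after an edit, at the cost of losing backwards stepping for the rest of that session.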
I will most definitely look into D3D11DebugManager::CreateShaderGlobalState though, thanks for the tip!!
With groupshared memory it's more tricky. To clarify: the compute debugging in RenderDoc is currently limited to running a single thread at once
Does the same apply to pixel shaders? Thinking of derivative instructions, visualizing the value of a variable across invocations etc.