
[Feature Request] DXIL Specialization Constant Support

Open jeremyong-az opened this issue 2 years ago • 7 comments

Currently, DXIL bytecode is emitted by the compiler for every shader permutation in a process typically managed by the user. These permutations are generally controlled via pound-defs specified as arguments to the shader compiler itself.

A preferable model, however, would be something like Vulkan's "specialization constants", which would let us generate the bytecode once and configure the last-mile compilation in the driver with additional PSO state. This would dramatically reduce storage costs for bytecode/PDBs, as well as compilation time.

I'm sure there are many complexities in getting a feature like this over the finish line, but I wanted to mention it in case it could get some traction, or to understand the roadblocks if it's untenable (or if I'm misunderstanding the advantages the approach could confer).

jeremyong-az • Oct 04 '21 16:10

(discussed offline that this is likely better implemented as a user-space feature, and an open question is whether this fits in the DirectXShaderCompiler project or somewhere else)

jeremyong-az • Oct 04 '21 17:10

Given that there isn't a DXIL manipulation library available out there with a permissive license, I can't see a separate user-space implementation of the feature becoming available any time soon. The closest library we have today is https://github.com/HansKristian-Work/dxil-spirv, which is LGPL and therefore unlikely to be approved for use in something like Unreal Engine.

Edit: Actually I take it back, @baldurk saves the day, as always: https://github.com/baldurk/renderdoc/tree/dc6bc12da77a799d49653e5ec87823c4f599916a/renderdoc/driver/shaders/dxil

Edit2: I take back the takeback :) Apparently, Baldur does not recommend using that framework.

yuriy-odonnell-epic • Nov 10 '21 09:11

FYI I don't really recommend that anyone else use my code for modifying DXIL. I wrote it because I was forced to with no other realistic choice for some debugging workflows, but I don't really trust it to be rock solid and I couldn't in good conscience suggest that anyone else build anything on it without severe "here be dragons" warnings. There's too high a risk of encountering some construct or pattern that isn't handled, or generating one that is invalid, without a reliable way to predict, test, or check for it. DXIL is such a hopeless and unusable format for shader interchange that I wouldn't really trust any external code/library to be robust and reliable except for this specific fork of LLVM.

baldurk • Nov 10 '21 10:11

To clarify my comment, I think even if it is a "user-space feature", that user-space facility would need to be provided as part of the dxc interface since the implementation is tightly coupled to the IR

jeremyong-az • Nov 10 '21 18:11

We have a benchmark in our benchmark suite that provides some insight into this by compiling a very trivial closest hit shader.

In SPIR-V we can patch the SPIR-V directly; compiling 4096 different rchit shaders takes around 7 ms.

Note: this code path uses our own patcher rather than specialization constants in the driver, because of the underlying thing we're trying to measure. We're patching to actively avoid spec constants (in other scenarios we would absolutely use them).
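As a rough illustration of what "patching the SPIR-V directly" can look like, here is a minimal sketch that overwrites the default value of a 32-bit spec constant in a module's word stream. This is not the patcher used in the benchmark (its implementation isn't shown in this thread); the function names are hypothetical, and the sketch assumes a scalar 32-bit `OpSpecConstant` decorated with `SpecId`. The opcode and decoration numbers are taken from the SPIR-V specification.

```python
# Sketch only: overwrite the default value of a 32-bit OpSpecConstant in a
# SPIR-V word stream. Function names are hypothetical; the numeric constants
# below come from the SPIR-V specification.
SPIRV_MAGIC = 0x07230203
HEADER_WORDS = 5          # magic, version, generator, bound, schema
OP_SPEC_CONSTANT = 50
OP_DECORATE = 71
DECORATION_SPEC_ID = 1


def instructions(words):
    """Yield (offset, opcode, word_count) for each instruction after the header."""
    i = HEADER_WORDS
    while i < len(words):
        word_count = words[i] >> 16      # high 16 bits: length in words
        opcode = words[i] & 0xFFFF       # low 16 bits: opcode
        if word_count == 0 or i + word_count > len(words):
            raise ValueError("malformed SPIR-V instruction stream")
        yield i, opcode, word_count
        i += word_count


def patch_spec_constant(words, spec_id, value):
    """Return a copy of `words` with the default of spec constant `spec_id` replaced."""
    if len(words) < HEADER_WORDS or words[0] != SPIRV_MAGIC:
        raise ValueError("not a SPIR-V module")

    # Pass 1: find result ids decorated with `SpecId spec_id`.
    # OpDecorate layout: [header, target id, decoration, literal...]
    targets = set()
    for i, opcode, word_count in instructions(words):
        if (opcode == OP_DECORATE and word_count >= 4
                and words[i + 2] == DECORATION_SPEC_ID
                and words[i + 3] == spec_id):
            targets.add(words[i + 1])

    # Pass 2: rewrite the literal of the matching OpSpecConstant.
    # OpSpecConstant layout (32-bit): [header, result type, result id, value]
    patched = list(words)
    for i, opcode, word_count in instructions(words):
        if opcode == OP_SPEC_CONSTANT and word_count == 4 and words[i + 2] in targets:
            patched[i + 3] = value & 0xFFFFFFFF
            return patched
    raise KeyError(f"no 32-bit spec constant with SpecId {spec_id}")
```

Producing N variants is then N list copies plus a one-word write each, with no compiler front end involved. A module patched this way still contains an `OpSpecConstant`; the driver simply sees a different default when no specialization info is supplied.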

In DirectX we need to re-invoke the compiler with a new define every time; for those 4096 rchit shaders, that takes around 13 seconds.
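For readers who haven't set this up, the DirectX side amounts to something like the sketch below: one full compiler invocation per variant, each with a different `SHADER_VARIANT` define. This assumes the `dxc` command-line tool is on the PATH and uses hypothetical file names (`rchit.hlsl`, `rchit_<n>.dxil`) and a `lib_6_3` target; the actual benchmark may well drive the compiler through its API instead of the CLI.

```python
import subprocess


def dxc_command(variant, source="rchit.hlsl", profile="lib_6_3"):
    """Build the dxc command line for one permutation (file names are hypothetical)."""
    return [
        "dxc",
        "-T", profile,                       # target profile (DXR shaders use a lib_* target)
        "-D", f"SHADER_VARIANT={variant}",   # the per-variant define
        "-Fo", f"rchit_{variant}.dxil",      # output object file
        source,
    ]


def compile_all(count):
    """Re-run the whole compiler once per variant; this is the slow path."""
    for variant in range(count):
        subprocess.run(dxc_command(variant), check=True)
```

Every iteration repeats parsing, optimization, and validation for a shader that differs in a single constant, which is where the per-variant cost comes from.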

The shader itself is extremely simple (the benchmark doesn't focus on compilation speed, rather it tries to test coherency gathering).

#ifndef SHADER_VARIANT
[[vk::constant_id(0)]] const uint shader_variant = -1;
#define SHADER_VARIANT shader_variant
#endif

struct Payload {
    uint color; // single DWORD
};

struct Attribute {
    uint unused;
};

[shader("closesthit")] void main(inout Payload payload, in Attribute attribs) {
    payload.color = SHADER_VARIANT;
}

For a simple shader like this, a small 1800x speedup is nothing to sneeze at.

We have another test that does 16384 variants; there DX12 takes ~50 seconds to compile the permutations, while our SPIR-V patching code takes around 26 ms.
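To put the two runs side by side, the quoted timings can be turned into a speedup and a per-variant cost. The inputs below are just the approximate figures from this comment, so the results are only as precise as those.

```python
def speedup(slow_seconds, fast_seconds):
    """Ratio of the recompile time to the patching time."""
    return slow_seconds / fast_seconds


# variant count -> (recompile seconds, patch seconds), as quoted above
runs = {4096: (13.0, 0.007), 16384: (50.0, 0.026)}

for variants, (recompile_s, patch_s) in runs.items():
    print(
        f"{variants} variants: {speedup(recompile_s, patch_s):.0f}x, "
        f"{recompile_s / variants * 1e3:.2f} ms vs "
        f"{patch_s / variants * 1e6:.2f} us per variant"
    )
```

Both runs land a little under 2000x, at roughly 3 ms per variant for recompilation against a couple of microseconds for patching, so the gap appears to scale linearly with the variant count.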

Jasper-Bekkers • Nov 10 '21 22:11

This would be extremely useful for http://github.com/godotengine/godot. We rely heavily on an ubershader + specialization constants model to keep shader caches small and to permute shader variants on the fly. It works fantastically in Vulkan, but porting Godot to Direct3D without something like this is quite difficult.

This is important because, unlike AAA titles, Godot aims to be an easy to use and inclusive game engine. The idea is that everything works out of the box as well as possible. Specialization constants in Vulkan avoid having large shader compilation times for most shader permutations, which also ensures that users with very poor hardware can make games (without having to wait for a long time for the permutations to compile).

reduz • Mar 14 '22 14:03

I've developed a method that gets you something not too far from native SCs, though less convenient than they would be: https://twitter.com/RandomPedroJ/status/1532725156623286272

RandomShaper • Jun 03 '22 14:06