DirectXShaderCompiler
Improving min16f reliability with constant buffer reads
(Filing this bug after internal discussions with Chris, Jesse and Tex)
**Is your feature request related to a problem? Please describe.**
We have game devs reporting that it's a challenge to use min16f on PC today due to inconsistencies across drivers when loading min16f types from constant buffers.
According to Tex: "min precision types are always defined as being represented as 32-bit types in host-visible buffers, with down-conversion allowed by the driver whenever it could make sense."
However, some drivers appear to load 16-bit values when a min16f is used in a constant buffer, while others load the full 32-bit values. This is at least partially due to a lack of HLK test coverage for the feature, and possibly due to under-spec'ing of HLSL/DXIL. See this GDC talk for more info: https://schedule.gdconf.com/session/fp16-shaders-in-frostbite/899054
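For illustration, here's a minimal sketch of the kind of shader where this shows up (names are made up; the layout comment reflects the min-precision rule Tex describes):

```hlsl
// Hypothetical example: per the min-precision rules, gExposure occupies a
// full 32-bit slot in host-visible memory, but drivers have been observed to
// disagree on whether the load itself is 16 or 32 bits wide.
cbuffer PerDraw : register(b0)
{
    min16float gExposure;   // stored as a 32-bit float in the cbuffer
    float3     gTint;
};

float4 main(float4 color : COLOR) : SV_Target
{
    // The width of the cbuffer read for gExposure is where implementations diverge.
    return color * gExposure * float4(gTint, 1.0);
}
```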
**Describe the solution you'd like**
Ideally, all drivers would work in a consistent and predictable manner. Perhaps we can plug any spec holes + write more thorough tests.
If that's not possible, then perhaps we could add a mode to DXC that emits 32-bit cbuffer loads followed by a downcast to 16-bit, instead of trusting the driver to get this right. It could be a compile-time option in DXC, or a pass that's run afterwards. A pass may be simpler, but I believe it would cause PDB problems for tools like PIX.
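As a rough sketch of what that mode would be equivalent to at the source level (illustrative only, not an existing DXC option): the member is loaded at full 32-bit width and narrowed explicitly afterwards, so the driver never has to decide how wide the cbuffer read should be.

```hlsl
// Rough source-level equivalent of the proposed "32-bit load + downcast" mode.
cbuffer PerDraw : register(b0)
{
    float gExposure32;   // emitted as a plain 32-bit float load
};

float4 main(float4 color : COLOR) : SV_Target
{
    min16float exposure = (min16float)gExposure32;  // explicit downcast after the load
    return color * exposure;
}
```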
**Describe alternatives you've considered**
It's tempting to ask developers not to use min-precision types in cbuffers, but that isn't practical. Some developers define custom half types that map to (1) full 16-bit floats, (2) min16f, or (3) 32-bit floats, depending on which platform they're targeting. They then use these custom half types extensively throughout their code on all platforms, and ship one set of shaders per platform.
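A sketch of that pattern (the macro and typedef names are made up for illustration; float16_t assumes -enable-16bit-types):

```hlsl
#if defined(TARGET_NATIVE_FP16)
    typedef float16_t  half_t;      // true 16-bit floats (requires -enable-16bit-types)
    typedef float16_t4 half4_t;
#elif defined(TARGET_MIN_PRECISION)
    typedef min16float  half_t;     // min-precision hint; 32-bit in host-visible cbuffers
    typedef min16float4 half4_t;
#else
    typedef float  half_t;          // plain 32-bit floats
    typedef float4 half4_t;
#endif

// The same declarations are then used everywhere, including in cbuffers:
cbuffer Material : register(b0)
{
    half_t  gRoughness;
    half4_t gBaseColor;
};
```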
If they had to avoid putting min16f in constant buffers, then the developers would have to:
- Have different half types depending on where they're used, and ask their developers to remember to use them correctly
- Do ugly things to "fix" 16-bit usage in cbuffers (e.g. pre or post processing)
- Ask their developers to write PC-specific code in their shaders whenever they're loading 16-bit types from cbuffers
Developers don't want this complexity in cross-platform code.