DirectXShaderCompiler

Support IMul/UMul/UDiv with two outputs from HLSL

Open tex3d opened this issue 5 years ago • 3 comments

The two-output forms of the IMul/UMul/UDiv ops have been part of the shader models since 4.0, but HLSL has never exposed them explicitly, so they cannot be fully used from HLSL code.

This issue suggests adding new intrinsic functions to HLSL that map to these DXIL ops. Doing so could open up scenarios without requiring the optional 64-bit support.

tex3d, Apr 10 '20 00:04

The UMul instruction, in particular, is useful for implementing the Philox-4x32 pseudorandom number generator. This is one of the fastest PRNGs available for GPUs, since it is non-sequential, and mulhi/mullo (the two outputs of UMul) are at the core of the Philox algorithm. Philox is commonly used by hardware-accelerated machine learning frameworks to initialize random weights in a model.

This instruction is exposed as umulExtended in GLSL.
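
To illustrate why both halves of the product matter, here is a minimal C++ sketch of a single Philox-4x32 round. C++ stands in for HLSL here, and `mulhilo32` uses a 64-bit multiply only to model what a two-output UMul would return. The names `HiLo`, `mulhilo32`, and `philox4x32_round` are hypothetical, and the multiplier constants are the ones I recall from the published Philox reference implementation, so treat them as an assumption to check. A full generator repeats this round several times with a key schedule, which is omitted.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for a two-output UMul: high and low 32 bits
// of the full 64-bit product of two unsigned 32-bit values.
struct HiLo {
    uint32_t hi;
    uint32_t lo;
};

inline HiLo mulhilo32(uint32_t a, uint32_t b) {
    const uint64_t p = static_cast<uint64_t>(a) * b;
    return { static_cast<uint32_t>(p >> 32), static_cast<uint32_t>(p) };
}

// Round multipliers (assumed to match the Philox-4x32 reference).
constexpr uint32_t kPhiloxM0 = 0xD2511F53u;
constexpr uint32_t kPhiloxM1 = 0xCD9E8D57u;

// One Philox-4x32 round: two mulhi/mullo pairs, then XOR the high halves
// with the remaining counter words and the key, and permute the words.
inline std::array<uint32_t, 4> philox4x32_round(
        const std::array<uint32_t, 4>& ctr,
        const std::array<uint32_t, 2>& key) {
    const HiLo p0 = mulhilo32(kPhiloxM0, ctr[0]);
    const HiLo p1 = mulhilo32(kPhiloxM1, ctr[2]);
    return { p1.hi ^ ctr[1] ^ key[0], p1.lo,
             p0.hi ^ ctr[3] ^ key[1], p0.lo };
}
```

Each round consumes both outputs of two multiplies, so a single two-output instruction per multiply avoids either a 64-bit multiply or a separate emulated mulhi.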

jstoecker, Apr 10 '20 00:04

@tex3d Needing this even more now, after having implemented several int64 operators in DirectML. The equivalent emulation is very verbose.
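
As a rough illustration of that verbosity, here is a C++ sketch of an unsigned 64-bit multiply built only from 32-bit operations. This is not DirectML's code; the names `U64`, `mulhi_u32`, and `mul_u64` are hypothetical. Without a mulhi instruction, the high half of a 32x32 product has to be assembled from four 16-bit partial products, which is the part a two-output UMul would replace with a single op.

```cpp
#include <cstdint>

// Hypothetical 64-bit value stored as two 32-bit halves.
struct U64 {
    uint32_t lo;
    uint32_t hi;
};

// High 32 bits of a 32x32-bit unsigned product, using only 32-bit math.
inline uint32_t mulhi_u32(uint32_t a, uint32_t b) {
    const uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    const uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    const uint32_t ll = a_lo * b_lo;  // contributes to bits 0..31
    const uint32_t lh = a_lo * b_hi;  // contributes to bits 16..47
    const uint32_t hl = a_hi * b_lo;  // contributes to bits 16..47
    const uint32_t hh = a_hi * b_hi;  // contributes to bits 32..63

    // Sum the pieces that land in bits 16..31; the sum of three 16-bit
    // values fits in 32 bits, and its upper part is the carry into bit 32.
    const uint32_t mid = (ll >> 16) + (lh & 0xFFFFu) + (hl & 0xFFFFu);

    return hh + (lh >> 16) + (hl >> 16) + (mid >> 16);
}

// Low 64 bits of a 64x64-bit unsigned product.
inline U64 mul_u64(U64 a, U64 b) {
    U64 r;
    r.lo = a.lo * b.lo;
    // Cross terms wrap modulo 2^32, which is the intended behavior here.
    r.hi = mulhi_u32(a.lo, b.lo) + a.lo * b.hi + a.hi * b.lo;
    return r;
}
```

With a two-output multiply available, `mulhi_u32` and the separate `a.lo * b.lo` collapse into one instruction, leaving only the two cross terms and the additions.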

fdwr, Dec 03 '21 06:12

+1

oscarbg, Aug 19 '22 11:08