ILGPU
`fp8` and `bfloat16` support
NVIDIA actually has two variants of fp8 (E4M3 and E5M2), with different sizes of mantissa/exponent. bfloat16 is also unique. There's also TensorFloat32, which is really more like a bfloat19. Perhaps it would make sense to have a generic float<SizeOfMantissa, SizeOfExponent> type (hackery).
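For reference, the bit budgets in question look roughly like this. The exponent/mantissa splits below are the standard ones NVIDIA documents for fp8, bf16 and tf32; the `FloatFormat` type itself is purely hypothetical and only meant to illustrate the `float<SizeOfMantissa, SizeOfExponent>` idea:

```csharp
// Hypothetical illustration only, not an existing ILGPU type; it just makes
// the exponent/mantissa trade-offs of the various formats explicit.
public readonly struct FloatFormat
{
    public FloatFormat(int exponentBits, int mantissaBits)
    {
        ExponentBits = exponentBits;
        MantissaBits = mantissaBits;
    }

    public int ExponentBits { get; }
    public int MantissaBits { get; }
    public int TotalBits => 1 + ExponentBits + MantissaBits;   // plus the sign bit

    public static readonly FloatFormat Fp8E4M3  = new FloatFormat(4, 3);   //  8 bits
    public static readonly FloatFormat Fp8E5M2  = new FloatFormat(5, 2);   //  8 bits
    public static readonly FloatFormat Half     = new FloatFormat(5, 10);  // 16 bits (IEEE binary16)
    public static readonly FloatFormat BFloat16 = new FloatFormat(8, 7);   // 16 bits (float32 range, short mantissa)
    public static readonly FloatFormat TF32     = new FloatFormat(8, 10);  // 19 bits, hence "bfloat19"
    public static readonly FloatFormat Single   = new FloatFormat(8, 23);  // 32 bits (IEEE binary32)
}
```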
It looks like Cuda provides a few alternate floating-point options, including bf16 and tf32.
This would have to be a Cuda-only feature, as there is no equivalent in OpenCL 2.0.
We already have support for Half. Would we add support for BFloat16 and TensorFloat32 types? @m4rs-mt
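As a point of reference, bfloat16 is just a float32 with the low 16 mantissa bits dropped, so a minimal standalone conversion (truncating only, ignoring rounding and NaN details, and not tied to ILGPU's existing Half) looks roughly like this:

```csharp
using System;

// Minimal sketch of a BFloat16 value type: truncating conversions only,
// no rounding, NaN canonicalization, or arithmetic.
public readonly struct BFloat16
{
    private readonly ushort bits;

    private BFloat16(ushort bits) => this.bits = bits;

    // Keep the sign, the full 8-bit exponent, and the top 7 mantissa bits.
    public static BFloat16 FromSingle(float value) =>
        new BFloat16((ushort)((uint)BitConverter.SingleToInt32Bits(value) >> 16));

    // Widening back to float32 is exact: restore the dropped low bits as zero.
    public static float ToSingle(BFloat16 value) =>
        BitConverter.Int32BitsToSingle((int)((uint)value.bits << 16));
}
```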
@MoFtZ that's why it might make sense to have the generic-sized float type as I mentioned, with boolean guards, e.g. a bool Accelerator.SupportsType(...) function, which would let the user choose a different kernel.
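A minimal sketch of that guard-and-fallback pattern, reusing the BFloat16 sketch above. SupportsType is only a placeholder here (it is not an existing ILGPU API); LoadAutoGroupedStreamKernel is ILGPU's usual kernel-loading entry point:

```csharp
using System;
using ILGPU;
using ILGPU.Runtime;

public static class KernelSelection
{
    // Placeholder for the proposed capability query; not a real ILGPU API.
    // Here it simply assumes bf16 is only worthwhile on Cuda accelerators.
    static bool SupportsType(this Accelerator accelerator, Type type) =>
        type != typeof(BFloat16) || accelerator.AcceleratorType == AcceleratorType.Cuda;

    // Low-precision kernel: scales bf16 values (via the BFloat16 sketch above).
    static void ScaleBF16(Index1D i, ArrayView<BFloat16> data, float factor) =>
        data[i] = BFloat16.FromSingle(BFloat16.ToSingle(data[i]) * factor);

    // Fallback kernel working on plain float32.
    static void ScaleFP32(Index1D i, ArrayView<float> data, float factor) =>
        data[i] *= factor;

    public static void Run(Accelerator accelerator)
    {
        if (accelerator.SupportsType(typeof(BFloat16)))
        {
            var kernel = accelerator.LoadAutoGroupedStreamKernel<
                Index1D, ArrayView<BFloat16>, float>(ScaleBF16);
            // ... allocate a bf16 buffer and launch the low-precision kernel.
        }
        else
        {
            var kernel = accelerator.LoadAutoGroupedStreamKernel<
                Index1D, ArrayView<float>, float>(ScaleFP32);
            // ... fall back to the float32 version of the same kernel.
        }
    }
}
```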
@lostmsu @MoFtZ I think this makes sense to add given that NVIDIA GPUs can take serious advantage of these types. I wonder how we can make this happen in a convenient way. Let's get into more detail on Thursday in our talk-to-dev session.
Based on our last discussions, this is more broadly related to adding support for the Cuda WMMA (Warp Level Matrix Multiply-Accumulate) instructions; adding support for the fp8 and bfloat16 types is not very useful without WMMA support.
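For context, this is roughly the kind of warp-level intrinsic such WMMA support might eventually expose. Everything below is hypothetical and reuses the BFloat16 sketch above; the 16x16x16 bf16 tile with a float32 accumulator mirrors Cuda's wmma::mma_sync shapes, but no such API exists in ILGPU today:

```csharp
using System;
using ILGPU;

// Purely hypothetical API surface, sketched only to show why bf16/fp8 storage
// matters: each warp would cooperatively compute C += A * B on a small tile,
// which is the operation tensor cores accelerate.
public static class WarpMatrix
{
    public static void MultiplyAccumulate(
        ArrayView2D<BFloat16, Stride2D.DenseX> a,   // 16x16 bf16 tile
        ArrayView2D<BFloat16, Stride2D.DenseX> b,   // 16x16 bf16 tile
        ArrayView2D<float, Stride2D.DenseX> c)      // 16x16 float32 accumulator
        => throw new NotSupportedException("Placeholder for a future WMMA intrinsic.");
}
```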