libcudacxx [RFE] std::experimental::fixed_size

[RFE] std::experimental::fixed_size_simd and arithmetic operators

Open kerrmudgeon opened this issue 2 years ago • 0 comments

This issue requests that std::experimental::fixed_size_simd and its arithmetic operators be added to libcu++.

This would give a unified, portable interface to operations that NVIDIA GPU hardware accelerates with vector instructions, including:

- elementwise multiply
- elementwise add
- elementwise multiply-add (e.g. half <= half * half + half, bfloat16 <= bfloat16 * bfloat16 + bfloat16)
- convert-and-pack (int8 <= float, half <= float, bfloat16 <= float)
- transcendental functions available in the CUDA Math Library (e.g. sqrt, exp, tanh, log, log10, erf, sin, cos, tan)
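
For reference, the elementwise multiply-add case might look as follows with the host-side Parallelism TS type. This is a minimal sketch that assumes a standard library shipping `<experimental/simd>` (for example libstdc++ from GCC 11 with `-std=c++17`); it does not show libcu++ code, since libcu++ does not provide this type yet. The helper name `multiply_add4` and the width of 4 are illustrative choices, not part of the request.

```cpp
#include <experimental/simd>

namespace stdx = std::experimental;

// Four floats handled as a single value; the width 4 is an arbitrary choice.
using float4_simd = stdx::fixed_size_simd<float, 4>;

// Hypothetical helper: computes out[i] = a[i] * b[i] + c[i] for i in [0, 4).
// Each pointer must refer to at least four floats.
inline void multiply_add4(const float* a, const float* b, const float* c, float* out) {
    float4_simd va, vb, vc;
    va.copy_from(a, stdx::element_aligned);
    vb.copy_from(b, stdx::element_aligned);
    vc.copy_from(c, stdx::element_aligned);

    // The arithmetic operators apply to every lane at once.
    float4_simd result = va * vb + vc;

    result.copy_to(out, stdx::element_aligned);
}
```

With such a type available in device code, the same expression `va * vb + vc` could in principle map to packed instructions for element types such as half or bfloat16, which is the portability benefit this request is after.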

CUTLASS currently implements the above as partial specializations of its operators and extensions, which is non-standard. CUTLASS would drop its own implementation and use libcu++'s once it becomes available.
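
To illustrate the general pattern described here (not CUTLASS's actual code), a functor template can be partially specialized for a fixed-length array type so that the array version applies the operation per element. All names below (`Array`, `multiply_add`) are hypothetical stand-ins chosen for this sketch.

```cpp
#include <cstddef>

// Hypothetical fixed-length array, standing in for a library-specific vector type.
template <typename T, std::size_t N>
struct Array {
    T data[N];
};

// Generic scalar functor: returns a * b + c.
template <typename T>
struct multiply_add {
    T operator()(const T& a, const T& b, const T& c) const { return a * b + c; }
};

// Partial specialization for Array<T, N>: applies the scalar functor to each element.
// A real library could dispatch to packed hardware instructions at this point.
template <typename T, std::size_t N>
struct multiply_add<Array<T, N>> {
    Array<T, N> operator()(const Array<T, N>& a,
                           const Array<T, N>& b,
                           const Array<T, N>& c) const {
        Array<T, N> d{};
        multiply_add<T> scalar_op;
        for (std::size_t i = 0; i < N; ++i) {
            d.data[i] = scalar_op(a.data[i], b.data[i], c.data[i]);
        }
        return d;
    }
};
```

Every library that takes this route ends up with its own array type and functor names, which is why a standard `fixed_size_simd` with ordinary operators would be preferable.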

Dec 14 '21 22:12 kerrmudgeon
