Halide
Improvements for Halide floating-point math functions.
Halide provides multiple versions of routines found in the libm math library, such as the transcendental functions. E.g. for exp there are:
- Plain `exp`, which is intended to map to the platform support for e^x on the appropriate data type, generally float and double.
- `halide_exp`, which is implemented in Halide, supports vectorization, and is intended to be a consistent implementation with a good tradeoff between accuracy and performance.
- `fast_exp`, which is implemented in Halide, supports vectorization, and is intended to be optimized for speed, with somewhat less accuracy and no support for NaNs and Infs.
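To make the accuracy/speed tradeoff concrete, here is a minimal scalar sketch of the kind of approximation a speed-oriented variant might use: range reduction followed by a short polynomial. This is not Halide's `fast_exp`; the name `sketch_fast_exp`, the degree-5 Taylor polynomial, and the supported input range are all assumptions chosen for illustration. Like the `fast_` variant described above, it makes no attempt to handle NaNs or Infs.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper, NOT Halide's fast_exp.
// Assumes x is finite and roughly within [-87, 88]; no NaN/Inf handling.
inline float sketch_fast_exp(float x) {
    assert(std::isfinite(x));

    const float log2e  = 1.44269504088896341f;
    // ln(2) split into a high part with trailing zero bits and a low correction,
    // so that k * ln2_hi is exact for the k values used here.
    const float ln2_hi = 0.693145751953125f;
    const float ln2_lo = 1.42860682030941723e-6f;

    // Range reduction: x = k*ln(2) + r with |r| <= ln(2)/2 (approximately).
    float k = std::floor(x * log2e + 0.5f);
    float r = (x - k * ln2_hi) - k * ln2_lo;

    // Degree-5 Taylor polynomial for e^r, evaluated with Horner's rule.
    float p = 1.0f / 120.0f;
    p = p * r + 1.0f / 24.0f;
    p = p * r + 1.0f / 6.0f;
    p = p * r + 0.5f;
    p = p * r + 1.0f;
    p = p * r + 1.0f;

    // Build 2^k directly from the IEEE-754 exponent field.
    uint32_t bits = static_cast<uint32_t>(static_cast<int32_t>(k) + 127) << 23;
    float scale;
    std::memcpy(&scale, &bits, sizeof scale);

    return p * scale;
}
```

The polynomial degree is the main knob: a lower degree is cheaper but less accurate, and a higher degree (or a minimax fit instead of Taylor coefficients) moves toward what a `halide_`-style variant would want.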
Numerous improvements are needed:
- The above information should be expanded slightly and put into the Halide header documentation (IROperator.h) to make the differences between these functions clearer to users.
- The supported types and some indication of accuracy and range should be provided for the `halide_` and `fast_` versions. The unadorned versions should be marked as not vectorized in most situations.
- In `strict_float` mode, the `halide_` versions should likely handle NaNs and Infs. It seems counter to the intention of the `fast_` versions to support NaNs and Infs.
- We should consider providing access to vectorized math libraries such as Intel's MKL via some mechanism. Possibilities include a library that surfaces them, adding new functions to Halide's `IROperator.h`, a target flag that retargets the existing unadorned function names to such a library, etc.
- We should expand the set of functions supported, specifically trigonometric and hyperbolic trigonometric functions. Of the latter, `tanh` is used in ML activation implementations.
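One way to produce the "indication of accuracy and range" mentioned above is to report the maximum error in ULPs over a sampled input interval. The sketch below is a standalone illustration and not part of Halide; `ordered_bits`, `ulp_distance`, and `max_ulp_error_vs_exp` are hypothetical names, and the reference is assumed to be double-precision `std::exp` rounded to float.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Map a float's bit pattern onto a monotonically ordered integer line,
// so that adjacent floats differ by exactly 1 and +0/-0 coincide.
inline int64_t ordered_bits(float f) {
    int32_t i;
    std::memcpy(&i, &f, sizeof i);
    return i < 0 ? static_cast<int64_t>(INT32_MIN) - i : static_cast<int64_t>(i);
}

// Distance between two non-NaN floats, measured in units in the last place.
inline int64_t ulp_distance(float a, float b) {
    assert(!std::isnan(a) && !std::isnan(b));
    int64_t d = ordered_bits(a) - ordered_bits(b);
    return d < 0 ? -d : d;
}

// Worst-case ULP error of a candidate exp implementation `f` on [lo, hi],
// sampled uniformly. Assumes samples >= 2 and that f returns non-NaN values.
template <typename F>
int64_t max_ulp_error_vs_exp(F f, float lo, float hi, int samples) {
    int64_t worst = 0;
    for (int s = 0; s < samples; ++s) {
        float x = lo + (hi - lo) * static_cast<float>(s) / static_cast<float>(samples - 1);
        float ref = static_cast<float>(std::exp(static_cast<double>(x)));
        int64_t err = ulp_distance(f(x), ref);
        if (err > worst) worst = err;
    }
    return worst;
}
```

A documented bound such as "at most N ULPs on [lo, hi] for float32" could then be stated per function and per type, which is the kind of information the header comments currently lack.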
Specifics that (will) have subissues of their own:
- The `halide_` and `fast_` operators typically only have support for 32-bit float. They are slower than they might be for 16-bit floating point and less accurate than they need to be for 64-bit floating point.
- The `halide_` routines should deliver accurate results for the full input range, inasmuch as best current knowledge allows without detrimental impact to efficiency.
- New functions need to be provided, such as `sin`, `cos`, `tanh`, etc.
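As a rough illustration of how a new function such as `tanh` could be layered on an exp-family primitive, here is a scalar sketch using the identity tanh(x) = expm1(2x) / (expm1(2x) + 2). It is an assumption about one possible approach, not a proposal for the actual Halide implementation; `sketch_tanh` and the saturation threshold of 10 are hypothetical choices for float32.

```cpp
#include <cmath>

// Hypothetical helper, NOT an existing Halide function.
inline float sketch_tanh(float x) {
    // In float32, tanh(x) rounds to +/-1 well before |x| reaches 10, so saturate
    // early. This also avoids inf/inf when expm1(2x) overflows.
    // A vectorized version would use a select instead of branches.
    if (x > 10.0f) return 1.0f;
    if (x < -10.0f) return -1.0f;

    // expm1 keeps relative accuracy near zero, where computing exp(2x) - 1
    // directly would lose bits to cancellation.
    float e = std::expm1(2.0f * x);

    // NaN inputs fail both comparisons above and propagate through here.
    return e / (e + 2.0f);
}
```

Because both comparisons are false for NaN, the input NaN propagates to the output, which matches what a `strict_float`-friendly variant would want; a speed-oriented variant could drop that guarantee.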
Does halide_sin exist? I wasn't aware of it
Per Steven's comment, rewrote the issue to use exp instead of sin.