
Avoid implicit conversions for bitwise operators

Open mnijhuis-tos opened this issue 1 year ago • 2 comments

This PR partially addresses #2707.

When using xsimd, the following code compiles, but the resulting code is inefficient if the return type of operator|(const char&, const char&) is int instead of char ([C++ implicitly converts/promotes char to int](https://en.cppreference.com/w/cpp/language/implicit_conversion)).

#include <xtensor/xfixed.hpp>

void xtensor_or(xt::xtensor_fixed<char, xt::xshape<16>>& b1,
                const xt::xtensor_fixed<char, xt::xshape<16>>& b2) {
    b1 |= b2;    
}
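
For reference, the promotion can be checked directly; this is a minimal, standalone illustration and not part of the PR:

#include <type_traits>
#include <utility>

// The built-in operator| applies integral promotion to both char operands,
// so the result type is int, not char.
static_assert(
    std::is_same<decltype(std::declval<const char&>() | std::declval<const char&>()), int>::value,
    "char | char is promoted to int");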

The generated assembly performs the operation on 32-bit integers and contains many pack and unpack instructions around it. Supplying explicit return types in the functors instead of 'auto' avoids those pack and unpack instructions and yields a single 'vpor' instruction (when using AVX2) that ORs all 16 values at once.
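
To make the idea concrete, here is a hypothetical sketch of the two functor styles; the names are illustrative and are not xtensor's actual internal functors:

#include <type_traits>
#include <utility>

// Deduced return type: for char operands, lhs | rhs deduces to int, so the
// promotion leaks into the expression type and into the SIMD code path.
struct bitwise_or_deduced
{
    template <class T1, class T2>
    constexpr auto operator()(T1&& lhs, T2&& rhs) const
    {
        return std::forward<T1>(lhs) | std::forward<T2>(rhs);
    }
};

// Explicit return type: the result is converted back to the common operand
// type, so char | char stays char and the operation can use 8-bit lanes.
struct bitwise_or_explicit
{
    template <class T1, class T2>
    constexpr std::common_type_t<std::decay_t<T1>, std::decay_t<T2>>
    operator()(T1&& lhs, T2&& rhs) const
    {
        return std::forward<T1>(lhs) | std::forward<T2>(rhs);
    }
};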

Although this improvement works for small integral types like char and unsigned short, using bool still results in inefficient code, because xt_simd::simd_condition<bool, bool>::value is false: xsimd::detail::simd_condition is explicitly false when both types are bool, for reasons unknown to me.

I have only applied the improvement to the bitwise operators, since it is probably not safe for the other operators. For example, adding two chars may overflow, so there the implicit conversion to int is useful. On the other hand, adding two ints can also overflow and they are not promoted to a wider type, so perhaps this optimization is possible for all operators.
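
A standalone illustration of the overflow argument (not code from this PR):

#include <iostream>

int main()
{
    char a = 100;
    char b = 100;
    // Both operands are promoted to int before the addition, so the sum
    // itself cannot overflow.
    int sum = a + b;                            // 200
    // The overflow only appears if the result is narrowed back to char.
    char narrowed = static_cast<char>(a + b);   // wraps on a signed 8-bit char
    std::cout << sum << ' ' << static_cast<int>(narrowed) << '\n';
}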

mnijhuis-tos Jun 02 '23 09:06

Thanks. This is not my expertise, but I do already have some questions.

tdegeus Jun 14 '23 08:06

@tdegeus The checks on this appear to be stuck, burning up compute time.

spectre-ns Nov 12 '23 17:11