xtensor
Using xtensors with booleans with xsimd: Compilation failure / inefficient code
The following simple test code does not compile when using -DXTENSOR_USE_XSIMD:
#include <xtensor/xfixed.hpp>

void xtensor_or(xt::xtensor_fixed<bool, xt::xshape<16>>& b1,
                const xt::xtensor_fixed<bool, xt::xshape<16>>& b2) {
    b1 = b1 | b2;
}
The compiler complains that it cannot convert an xsimd::batch_bool<int, xsimd::sse2> to xsimd::batch<int, xsimd::sse2>. Apparently xtensor and/or xsimd use these types internally to perform the actual computations.
When I change b1 | b2; to xt::cast<bool>(b1 | b2); the compiler accepts the code. The resulting assembly looks good: b1 and b2 are loaded from memory, then there is a single por instruction, and b1 is stored again.
When I change the element type of the two tensors from bool to char or uint8_t, the resulting assembly shows many pack and unpack instructions along with four por instructions. This assembly indicates that xtensor/xsimd internally uses ints instead of chars. When I disable xsimd, the resulting assembly looks good again: no pack/unpack instructions and a single por instruction. I suspect that xsimd_return_type forces the conversion of char and uint8_t to int; however, I could not pinpoint exactly where it happens.
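For reference, plain C++ already widens 8-bit operands to int before applying |, independent of any library. Whether xsimd_return_type ends up with int for this reason is only an assumption on my side, not something confirmed in the xtensor or xsimd sources. The sketch below just checks the language rule; or_bytes is a hypothetical helper, not part of xtensor or xsimd.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Integral promotion: both 8-bit operands are converted to int before |,
// so the result type of the expression is int, not the operand type.
static_assert(std::is_same<decltype(std::uint8_t{1} | std::uint8_t{2}), int>::value,
              "uint8_t | uint8_t has type int");
static_assert(std::is_same<decltype(char{1} | char{2}), int>::value,
              "char | char has type int");

// Hypothetical helper: element-wise OR of 16 bytes that narrows each
// result back to uint8_t explicitly, so the output stays 8 bits wide.
inline std::array<std::uint8_t, 16> or_bytes(const std::array<std::uint8_t, 16>& a,
                                             const std::array<std::uint8_t, 16>& b) {
    std::array<std::uint8_t, 16> out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = static_cast<std::uint8_t>(a[i] | b[i]);
    }
    return out;
}
```

A generic wrapper that takes decltype(a | b) as its result type would therefore pick int for 8-bit inputs unless it narrows the result back explicitly, as or_bytes does.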
I can understand why xsimd unpacks 8-bit booleans into 128-bit SIMD registers holding four 32-bit booleans each: xsimd::batch_bool allows using booleans together with other types. However, this feature should not result in compilation errors. Also, operations that only use booleans should not have degraded performance.
Since this issue occurs when combining xtensor with xsimd, I am raising it here. Without xsimd, xtensor<bool, N> works fine. Also, xsimd::batch_bool<bool> works fine in isolation. Fixes may therefore be needed in both xtensor and xsimd.
Please have a look at this issue and investigate the compilation error at least.
After some further investigation I found that the bitwise OR operation (single |) yields an int even though the inputs are bool. Using a logical OR (double ||) does yield a bool. The compilation error then disappears.
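The difference between the two operators can be checked in plain C++ without xtensor. This is a minimal sketch of the language rule only; the helper names or_bitwise_as_bool and or_logical are made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Bitwise OR promotes both bool operands to int, so the result type is int.
static_assert(std::is_same<decltype(true | false), int>::value,
              "bool | bool has type int");
// Logical OR converts its operands to bool and yields a bool.
static_assert(std::is_same<decltype(true || false), bool>::value,
              "bool || bool has type bool");

// Hypothetical helper: bitwise OR with an explicit conversion back to bool.
inline bool or_bitwise_as_bool(bool a, bool b) {
    return static_cast<bool>(a | b);
}

// Hypothetical helper: logical OR, whose result is already a bool.
inline bool or_logical(bool a, bool b) {
    return a || b;
}
```

For bool inputs both helpers return the same value; only the type of the intermediate expression differs. In the test function from the original report, the logical variant corresponds to writing b1 = b1 || b2;.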
Any update on this issue? I spent way too long digging through the error messages, and they gave me no clue about what I was doing wrong. I was off by only one pipe character.