xsimd
Unaligned access
This works without xsimd:
auto another = xt::view(src, i - rows, xt::all());
auto current = xt::view(src, i, xt::all());
//output row view
auto vr = xt::view(res, i, xt::all());
vr = current / another - 1.;
With SIMD enabled, it throws an exception at the last line (vr = ...), inside the function _mm256_loadu_pd, on the VMOVUPD instruction.
If I change the code to the following, it seems OK:
auto another = xt::view(src, i + rows, xt::all());
auto current = xt::view(src, i, xt::all());
//output row view
auto vr = xt::view(res, i, xt::all());
auto tmp = another / current - 1.;
std::copy(tmp.begin(), tmp.end(), vr.begin());
Update: I think the first version (with operator=) still works on small datasets, but fails on large ones that come close to using swap.
The last code snippet is equivalent to disabling xsimd, which explains why it works. Indeed, the line auto tmp = another / current - 1.; builds an unevaluated expression (no SIMD instruction involved). Then, you perform a copy via the iterators, which does not involve any SIMD instruction either. If you replace the copy with an assignment (vr = tmp), you should notice the same failure.
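To illustrate the point about unevaluated expressions, here is a minimal sketch in plain C++ with no xtensor dependency. LazyRatioMinusOne and copy_elementwise are hypothetical names made up for this example; they only mimic the idea and are not xtensor's actual expression types or assignment code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an unevaluated "num / den - 1." expression.
// It only stores references; nothing is computed at construction time.
class LazyRatioMinusOne {
public:
    LazyRatioMinusOne(const std::vector<double>& num, const std::vector<double>& den)
        : num_(num), den_(den) {
        assert(num.size() == den.size());
    }

    std::size_t size() const { return num_.size(); }

    // Each element is computed on demand, one scalar at a time.
    double operator[](std::size_t i) const {
        ++evaluations_;
        return num_[i] / den_[i] - 1.0;
    }

    // Number of elements computed so far (for illustration only).
    std::size_t evaluations() const { return evaluations_; }

private:
    const std::vector<double>& num_;
    const std::vector<double>& den_;
    mutable std::size_t evaluations_ = 0;
};

// Element-wise copy, playing the role of std::copy over the expression's iterators.
inline void copy_elementwise(const LazyRatioMinusOne& expr, std::vector<double>& out) {
    out.resize(expr.size());
    for (std::size_t i = 0; i < expr.size(); ++i) {
        out[i] = expr[i];
    }
}
```

Constructing the object performs no arithmetic. The work happens only inside copy_elementwise, one scalar at a time, which is why that path never reaches a batched load.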
Can you paste the exception message you got? Also, what is the ratio dataset size / memory for which it starts to fail?
I have three similar copy-pasted variants of this code. Two of them crashed periodically; the third never did. Also, when I said "small data", that was hand-crafted data, while the real data comes from the web. All of those tables must end up with the same dimensions, so the memory allocated seems to be the same (also, the third one is allocated at a later step, when more RAM is already in use overall). Could it be because of NaNs? The default value is NaN, so wherever some data is missing it remains NaN.
The error was just a runtime error with a long call stack from operator= down to the VMOVUPD instruction, which the disassembly view pointed to.
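On the NaN question: under default floating-point settings, a quiet NaN simply propagates through division and subtraction without raising a hardware exception, so NaNs look like an unlikely cause for a fault at a load instruction. A minimal sketch to check this on plain doubles follows; relative_change, count_nans and check_nan_propagation are hypothetical helper names for this example, not part of xtensor or xsimd.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Same arithmetic as in the snippets above: current / another - 1.
inline double relative_change(double current, double another) {
    return current / another - 1.0;
}

// Counts the NaN entries in a row, to see how much data is actually missing.
inline std::size_t count_nans(const std::vector<double>& row) {
    std::size_t n = 0;
    for (double v : row) {
        if (std::isnan(v)) {
            ++n;
        }
    }
    return n;
}

// Self-check: a NaN on either side yields NaN, with no exception thrown.
inline void check_nan_propagation() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    assert(std::isnan(relative_change(nan, 2.0)));
    assert(std::isnan(relative_change(2.0, nan)));
}
```

This assumes the build does not use options such as -ffast-math and does not unmask floating-point traps.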
More info:
Used function:
// this is a workaround for the operator= used before, as it was crashing with SIMD
template <class ViewClass>
void move_to_row(tensors::result_t& res, const size_t row, ViewClass&& src)
{
auto vr = xt::view(res, row, xt::all());
{
const auto s1 = containers::countof(vr);
const auto s2 = containers::countof(src);
if (s1 != s2)
FAILED("Supplied different sizes to move_to_row!");
}
//std::move(src.begin(), src.end(), vr.begin());
vr = src;
}
Here s1 = s2 = 1800 doubles.
If I do this: auto&& e = xt::eval(src); it fails right there, where src is the original current / another - 1. expression.
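For scale, _mm256_loadu_pd reads 4 doubles at a time, and 1800 divides evenly by 4, so a row of this size would leave no scalar remainder at its end. A minimal sketch of that arithmetic follows, assuming a plain "full batches plus scalar tail" split. BatchSplit and split_for_simd are hypothetical names, and this does not claim to reflect how xtensor actually partitions the assignment loop (it may, for instance, also peel elements at the start for alignment).

```cpp
#include <cassert>
#include <cstddef>

// Result of splitting n elements into full SIMD batches plus a scalar tail.
struct BatchSplit {
    std::size_t full_batches;
    std::size_t scalar_tail;
};

// batch_width = 4 assumes 256-bit AVX registers holding doubles.
inline BatchSplit split_for_simd(std::size_t n, std::size_t batch_width = 4) {
    assert(batch_width > 0);
    return BatchSplit{n / batch_width, n % batch_width};
}
```

With n = 1800 this gives 450 full batches and a tail of 0. A row of 1803 elements would instead leave 3 elements for a scalar tail.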
One more try to split it
Maybe wrong pointer? Not sure ...
Going further, this also fails:
using tensor_vector_t = xt::xtensor<value_t, 1>;
const tensor_vector_t another = xt::eval(xt::view(src, i + rows, xt::all()));
where src is a slice of the cube at a constant field:
return xt::view(tensor, field, xt::all(), xt::all());
Interestingly, the error goes away when I eval the slice:
const auto src = xt::eval(parsed.viewByField(1));
which evaluates
return xt::view(tensor, field, xt::all(), xt::all());
That makes sense: in the crafted tests not all fields are present in that cube, so it is smaller. P.S. It is better to say "tower" instead of "cube", as it may have a different "height".
This is not within xsimd's scope, please open an issue in https://github.com/xtensor-stack/xtensor if you're still having issues with this.