xsimd

Unaligned access

Open alexzk1 opened this issue 3 years ago • 9 comments

This works without xsimd:

auto another = xt::view(src, i - rows, xt::all());
auto current = xt::view(src, i, xt::all());

//output row view
auto vr = xt::view(res, i, xt::all());
vr = current / another - 1.;

With SIMD enabled, it throws an exception on the last line (vr = ...), inside the function _mm256_loadu_pd, at the VMOVUPD instruction.
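
For context, _mm256_loadu_pd is the unaligned-load intrinsic, so a misaligned address on its own would not normally be expected to fault there. One quick way to rule alignment in or out is to check whether the row pointers are 32-byte aligned. The sketch below is plain C++ with a hypothetical helper name (is_aligned); it is not part of xsimd or xtensor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an xsimd/xtensor function):
// returns true if p sits on an `alignment`-byte boundary.
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

Logging the result for the address of each row's first element, just before the failing assignment, would show whether the crash correlates with alignment at all.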

If I change the code to the following, it seems to work:

auto another = xt::view(src, i + rows, xt::all());
auto current = xt::view(src, i, xt::all());
//output row view
auto vr = xt::view(res, i, xt::all());
auto tmp = another / current - 1.;
std::copy(tmp.begin(), tmp.end(), vr.begin());

alexzk1 avatar Mar 24 '21 17:03 alexzk1

Update: I think the first version (with operator=) still works on small datasets, but fails on large ones that come close to using swap.

alexzk1 avatar Mar 24 '21 17:03 alexzk1

The last code snippet is equivalent to disabling xsimd, which explains why it works. Indeed, the line auto tmp = another / current - 1.; builds an unevaluated expression (no SIMD instruction involved). Then, you perform a copy via the iterators, which does not involve any SIMD instruction either. If you replace the copy with an assignment (vr = tmp), you should notice the same failure.
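
The distinction above can be sketched in plain C++. This is only an illustration under simplifying assumptions, not xtensor's actual implementation: LazyRatio and copy_elementwise are hypothetical names, and std::vector stands in for the views.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an unevaluated expression such as
// `another / current - 1.`: it only stores references to its operands
// and computes nothing until an element is requested.
struct LazyRatio {
    const std::vector<double>& num;
    const std::vector<double>& den;

    std::size_t size() const { return num.size(); }

    // Each element is computed on demand, one scalar at a time.
    double operator[](std::size_t i) const { return num[i] / den[i] - 1.0; }
};

// Scalar, element-by-element copy, analogous to std::copy over iterators:
// no vector loads or stores are requested explicitly here.
inline void copy_elementwise(const LazyRatio& expr, std::vector<double>& out) {
    assert(expr.size() == out.size());
    for (std::size_t i = 0; i < expr.size(); ++i) {
        out[i] = expr[i];
    }
}
```

Building a LazyRatio performs no arithmetic; the work happens only inside copy_elementwise. In this simplified picture, an assignment is the step where a library is free to process several elements per instruction instead.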

Can you paste the exception message you got? Also, what is the ratio dataset size / memory for which it starts to fail?

JohanMabille avatar Mar 24 '21 21:03 JohanMabille

I have three similar copy-pasted blocks. Two of them crashed periodically; the third never did. Also, when I said "small data", that was hand-crafted, while the real data comes from the web. All of those tables must have the same dimensions in the end, so the allocated memory seems to be the same (the third is also allocated at a later step, when more RAM is already in use overall). Then... could it be because of NaNs? The default value is NaN, so if some data is missing it remains NaN there.

The error was just a runtime error with a long stack trace from operator= down to the VMOVUPD instruction, which the disassembly view pointed to.
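
On the NaN question: assuming the default floating-point environment (no trapping enabled), NaN operands simply propagate through the arithmetic, and a load instruction such as VMOVUPD performs no arithmetic at all, so NaNs by themselves seem an unlikely cause. A minimal scalar sketch of the same expression, with hypothetical helper names (relative_change, count_nans):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Computes current / another - 1 element by element, mirroring the
// expression from this thread (scalar code, no SIMD involved).
inline std::vector<double> relative_change(const std::vector<double>& current,
                                           const std::vector<double>& another) {
    assert(current.size() == another.size());
    std::vector<double> out(current.size());
    for (std::size_t i = 0; i < current.size(); ++i) {
        out[i] = current[i] / another[i] - 1.0;
    }
    return out;
}

// Counts how many entries are NaN, e.g. to see how much data is missing.
inline std::size_t count_nans(const std::vector<double>& v) {
    std::size_t n = 0;
    for (double x : v) {
        if (std::isnan(x)) {
            ++n;
        }
    }
    return n;
}
```

With a NaN in either operand, the corresponding output element is NaN and the loop finishes normally.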

alexzk1 avatar Mar 25 '21 03:03 alexzk1

More info: image

Used function:


//this is a workaround for the operator= used before, as it was crashing with SIMD
template <class ViewClass>
void move_to_row(tensors::result_t& res, const size_t row, ViewClass&& src)
{
    auto vr = xt::view(res, row, xt::all());
    {
        const auto s1 = containers::countof(vr);
        const auto s2 = containers::countof(src);
        if (s1 != s2)
            FAILED("Supplied different sizes to move_to_row!");
    }
    //std::move(src.begin(), src.end(), vr.begin());
    vr = src;
}

Here s1 = s2 = 1800 doubles.

alexzk1 avatar Mar 25 '21 17:03 alexzk1

If I do this: `auto&& e = xt::eval(src);`

it fails there, where src is the original current / another - 1.

alexzk1 avatar Mar 25 '21 17:03 alexzk1

One more attempt to split it up: image

alexzk1 avatar Mar 25 '21 17:03 alexzk1

Maybe a wrong pointer? Not sure... image

alexzk1 avatar Mar 25 '21 17:03 alexzk1

Going further, this also fails:

using tensor_vector_t = xt::xtensor<value_t, 1>;
const tensor_vector_t another = xt::eval(xt::view(src, i + rows, xt::all()));

Where src is a slice of the cube at a constant field:

return xt::view(tensor, field, xt::all(), xt::all());

alexzk1 avatar Mar 25 '21 18:03 alexzk1

Interestingly, the error goes away when I eval the slice:

const auto src = xt::eval(parsed.viewByField(1));

which evaluates the result of return xt::view(tensor, field, xt::all(), xt::all());

That makes sense: in the hand-crafted tests, not all fields are present in that cube, so it is smaller. P.S. It would be better to say "tower" instead of "cube", as it may have a different "height".
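
The difference between keeping the slice as a view and evaluating it can be sketched as follows. This is a plain C++ illustration with hypothetical names (RowView, row_view, materialize), not xtensor's API, and it makes no claim about the root cause of the crash: a view only points into storage owned elsewhere, while an evaluated copy owns its own buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view of one row of a row-major buffer.
struct RowView {
    const double* data;  // points into storage owned by someone else
    std::size_t size;

    double operator[](std::size_t i) const { return data[i]; }
};

// Builds a view of row `row` in a buffer with `cols` columns per row.
inline RowView row_view(const std::vector<double>& buf, std::size_t cols,
                        std::size_t row) {
    assert(cols > 0 && (row + 1) * cols <= buf.size());
    return RowView{buf.data() + row * cols, cols};
}

// Copies the viewed elements into an owning container,
// roughly what evaluating a view amounts to in this simplified picture.
inline std::vector<double> materialize(const RowView& v) {
    return std::vector<double>(v.data, v.data + v.size);
}
```

After materialize, later changes to the original buffer are visible through the view but not through the copy, and the copy stays valid independently of the source.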

alexzk1 avatar Mar 25 '21 18:03 alexzk1

This is not within xsimd's scope, please open an issue in https://github.com/xtensor-stack/xtensor if you're still having issues with this.

serge-sans-paille avatar Mar 06 '23 20:03 serge-sans-paille