Unexpected slowdown in basic example (provided)
Hello all,
I'm looking at integrating the library into a project I'm working on.
However, I want to make sure that I set off on the right foot.
Thus, I have made a very simple minimum working example, using CMake, git submodules, and an old example I found lying around.
You can find the MWE here, which I will improve in responses to this thread.
However, I'm finding a roughly 18x slowdown using SIMD, which is not what I would expect:
Standard: 2 ms
SIMD: 37 ms
Before integrating, I want to make sure I avoid stumbling blocks such as this.
Does anyone have any insight into what's going on?
Cheers
I've updated my basic example to also use Intel intrinsics, following the same programming pattern.
Running ./simd_mwe 524288 100 on a test machine:
Standard: 90 ms
libsimdpp (4): 1569 ms
libsimdpp (8): 1428 ms
SSE Intrinsics (4): 55 ms
SSE Intrinsics (8): 28 ms
Is compiling with -march=native and making the libsimdpp header available to the file not enough for libsimdpp to use the correct instructions?