xsimd

C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE)

Results 141 xsimd issues

There is a trick borrowed from Lemire that I am using to implement left shift when unavailable: multiply by 2^n. (The rest of the trick is that for right shifting,...
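The multiply-by-2^n idea for left shifts can be illustrated in scalar form. This is only a minimal sketch, not the code from the issue: `power_of_two` and `lshift_by_multiply` are hypothetical names, and the example assumes 32-bit unsigned lanes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of xsimd: returns 2^n without using a shift.
// A SIMD kernel would more likely take these powers from a constant table.
inline std::uint32_t power_of_two(unsigned n)
{
    assert(n < 32); // same precondition as a built-in 32-bit shift
    std::uint32_t p = 1;
    for (unsigned i = 0; i < n; ++i)
        p *= 2u;
    return p;
}

// Emulates x << n with a multiplication. Unsigned arithmetic wraps
// modulo 2^32, which discards exactly the bits a left shift would drop.
inline std::uint32_t lshift_by_multiply(std::uint32_t x, unsigned n)
{
    return x * power_of_two(n);
}
```

The equivalence relies on unsigned wrap-around, so a signed variant would presumably need to go through the unsigned type first.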

I've noticed that currently `load_masked` and `store_masked` only support `batch_bool_constant`. I think `load_masked` and `store_masked` are very suitable for dealing with loop tails, however in this case the mask...

@serge-sans-paille @JohanMabille this idea works, but I cannot figure out how to refactor `bitwise_lshift_as_twice_larger` into a separate header. The issue: - `xsimd_sse2.hpp` needs `utils/shits.hpp` for `bitwise_lshift_as_twice_larger` - but `bitwise_lshift_as_twice_larger` needs...

This is the proposed Pixi workflow Tasks: - [x] Add pixi.toml - [x] Add CMakePresets - [x] Add usage to documentation - [x] Add sample usage in CI Some notes...

While adding too many utilities for `batch_constant` may not be a goal, I believe an additional utility to generate a `batch_constant` with increasing numbers from zeros could be an interesting...
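A compile-time sequence of increasing values starting at zero can be sketched in standard C++14 with `std::index_sequence`. The `constant_pack` type and `iota_pack` alias below are hypothetical stand-ins chosen for illustration; they are not xsimd's `batch_constant` and do not reflect the utility the issue proposes.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a pack of compile-time constants.
// This is NOT xsimd's batch_constant, only an illustration of the idea.
template <class T, T... Values>
struct constant_pack
{
    static constexpr std::size_t size = sizeof...(Values);

    // Materialize the pack as a runtime-accessible array.
    static constexpr std::array<T, sizeof...(Values)> as_array()
    {
        return { Values... };
    }
};

// Maps the indices 0, 1, ..., N-1 onto the value pack of constant_pack.
template <class T, std::size_t... Is>
constexpr auto make_iota_impl(std::index_sequence<Is...>)
    -> constant_pack<T, static_cast<T>(Is)...>
{
    return {};
}

// iota_pack<T, N> is constant_pack<T, 0, 1, ..., N-1>.
template <class T, std::size_t N>
using iota_pack = decltype(make_iota_impl<T>(std::make_index_sequence<N> {}));

static_assert(std::is_same<iota_pack<int, 4>, constant_pack<int, 0, 1, 2, 3>>::value,
              "iota_pack should expand to 0, 1, 2, 3");
```

Because the values live in the type, the sequence costs nothing at run time; a start offset or step could be added as extra template parameters if needed.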

Currently, I use [Pixi](https://pixi.sh) locally, a modern project-oriented Conda environment manager, and I would like to contribute that setup here. It can easily manage multiple environments and tasks (along with...

On main, running the tests gives this failure. ``` [doctest] doctest version is "2.4.12" [doctest] run with "--help" for options =============================================================================== /Users/antoine/workspace/github.com/xtensor-stack/xsimd/test/test_error_gamma.cpp:156: TEST CASE: [error gamma] gamma /Users/antoine/workspace/github.com/xtensor-stack/xsimd/test/test_error_gamma.cpp:150: ERROR: CHECK_EQ(...

Some intrinsics for a size `N` are introduced in the same generation that introduces a register of size `2N`. - `_mm_srlv_epi32` (128 bits) is introduced in AVX2, along with `_mm256`...

1. Adding stream API for non temporal data transfers 2. Adding xsimd::fence as a wrapper around std atomic for cache coherence 3. Adding tests ~~Draft because I need to double...