nsimd
Agenium Scale vectorization library for CPUs and GPUs
Currently, only a small part of the basic `NSIMD` operators are implemented in the fixed_point module. However, most of the other operators like multiple loads/stores, zip/unzip or casts can be...
To test on ARM, I needed to cross-compile, and I decided to use Docker for this. This pull request contains the respective Dockerfiles and test scripts.
The intrinsic name `round_to_even` sounds strange. I assume that this is the usual "round to the nearest integer, break ties towards even numbers". However, the name of the intrinsic reads...
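For readers unfamiliar with the rounding rule being discussed, here is a minimal scalar sketch of "round to the nearest integer, break ties towards even numbers". The function name `round_ties_to_even_scalar` is hypothetical and chosen only for illustration; this is not the nsimd implementation, and it assumes the intrinsic really does follow this rule, as the issue supposes.

```cpp
#include <cmath>

// Hypothetical scalar illustration, not part of nsimd.
// Rounds x to the nearest integer; exact ties (fraction == 0.5)
// go to the even neighbour.
inline double round_ties_to_even_scalar(double x) {
  const double lower = std::floor(x);  // largest integer <= x
  const double frac = x - lower;       // in [0, 1)
  if (frac > 0.5) return lower + 1.0;  // closer to the upper neighbour
  if (frac < 0.5) return lower;        // closer to the lower neighbour
  // Exact tie: pick whichever neighbour is even.
  return (std::fmod(lower, 2.0) == 0.0) ? lower : lower + 1.0;
}
```

With this rule, 0.5 and 2.5 round down to 0 and 2, while 1.5 and 3.5 round up to 2 and 4. A "round half away from zero" rule would instead give 1, 3, 2 and 4.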
I'm trying to modify nsimd, and I find it difficult to get started since it's not obvious which files are autogenerated and which are not. It would be nice if...
BFloat16 values are standard float32 values truncated to their upper 16 bits, therefore:
- loads involve unpacks, and
- stores involve unzips.
This is OK for all supported architectures. Reference: .
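The bit-level relationship behind this can be sketched in scalar code. The helper names `f32_to_bf16_trunc` and `bf16_to_f32` are hypothetical and not nsimd functions. The sketch assumes plain truncation with no rounding, as the description above states.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers, not part of nsimd.

// float32 -> bfloat16: keep the upper 16 bits (sign, 8 exponent bits,
// top 7 mantissa bits) and drop the lower 16 bits without rounding.
inline std::uint16_t f32_to_bf16_trunc(float x) {
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof bits);  // well-defined type punning
  return static_cast<std::uint16_t>(bits >> 16);
}

// bfloat16 -> float32: place the 16 bits in the upper half and
// fill the lower half with zeros.
inline float bf16_to_f32(std::uint16_t h) {
  const std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
  float x;
  std::memcpy(&x, &bits, sizeof x);
  return x;
}
```

For example, `f32_to_bf16_trunc(1.0f)` yields `0x3F80`, and `bf16_to_f32(0x4049)` yields `3.140625f`. In a vectorized version, widening on load amounts to interleaving each 16-bit lane with a zero lane (an unpack), and narrowing on store amounts to keeping only the upper 16-bit halves (an unzip).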
First, size agnostic shuffles, such as reverse, unpack, zip, unzip, ... will be added. Then it seems that (to be confirmed) all supported and yet to be supported architecture by...