Vc
Vc copied to clipboard
Implement ARM Neon support for Vc
- [x] branch off mkretz/datapar
- [x] go through the unmerged commits on mkretz/NEON (https://github.com/VcDevel/Vc/compare/mkretz/NEON) and make a list of commits that should be merged to mkretz/datapar and/or your NEON branch (let @mkretz review)
- [x] git cherry-pick the chosen commits to be merged to mkretz/datapar
- [x] adjust to buildsystem and datapar architecture changes (aka. make it compile & pass)
- [x] pull request
- [x] cherry pick the remaining commits from mkretz/NEON
- [ ] TBD
@mattkretz, I rechecked list and I think all commits are useful or partially useful. I checked merge conflicts and they are affecting mostly build system - travis.yml, CMakeList.txt, OptimizeforArchitecture.txt, test/CMakeLists.txt.
Can you looks please? Or better to create a list?
Maybe I can try to merge in my branch (branched off from mkretz/datapar) and then after pull when all conflicts will be solved?
Maybe I can try to merge in my branch (branched off from mkretz/datapar) and then after pull when all conflicts will be solved?
You should give it a try. You should be able to resolve some of the cherry pick conflicts yourself. Give me the ones that are too messed up for you to resolve.
What do you think about organizing our work on https://github.com/VcDevel/Vc/projects/3? Makes it easy for you to put subtasks on my to list - and for me to see where you are.
Anyway, have you thought about where the commits should be cherry-picked? Any commits that should go to mkretz/datapar ASAP?
I checked how works a merge from mkretz/NEON to mkretz/datapar and already solved all conflicts, compilation of course fails, but it will be a next step. Will try to push to my branch, and currently writing you a list to cherrypick in mkrets/datapar.
I would like to ask you a question is it make sense to focus on support of all architectures: arm|aarch32|aarch64? Currently I have access to "aarch64", and can try to find old "arm"..
a merge from mkretz/NEON to mkretz/datapar
Wait, did you do git merge mkretz/NEON
then? That's not what I intended. I wanted to have the NEON commits rebased onto mkretz/datapar. Either via git cherry-pick
or via git rebase -i
.
Edit: Why? Because it'll be very hard (aka impossible) to rebase mkretz/datapar onto master after a merge of NEON.
support of all architectures?
Focus on aarch64 for now. It would be nice to support them all at some point. But it probably doesn't make sense for you at this point.
@mattkretz , my bad, you are totally right. Anyway it was a good practice :) Is it make sense maybe to add to datapar next commits based on detection of system Vc_x86 and Vc_ARM:
https://github.com/VcDevel/Vc/commit/08efe94c58799caf78a085ff5614f2db265b5038 https://github.com/VcDevel/Vc/commit/e6b86f827cbeee56994c0e7491aa6a202583a96e
I agree those two commits should go to mkretz/datapar. They'll not apply without a bit of extra work, but you should give it a shot.
@mattkretz , I added also two more commits to be cherry-picked:
6cdad3a3fd5ade362fd315951b52b0fc393b8bf5 ccee375b184d648cbdfcdda647fded4aac2b716d 08efe94c58799caf78a085ff5614f2db265b5038 e6b86f827cbeee56994c0e7491aa6a202583a96e
Looks about right, yes. 6cdad3a3fd5ade362fd315951b52b0fc393b8bf5 might not be important, though. I'm trying to fade out the use of the Vc_IMPL macro in mkretz/datapar
. Short summary: there are two somewhat orthogonal relations to the target hardware specifics:
- The ABI. This is (since Vc 1.0), the second template parameter to
Vc::Vector
/Vc::datapar
. It relates to the registers that are used to pass objects of typedatapar
around. TheVc_IMPL
macro had an influence on the default ABI parameter. - The ISA. This determines the instructions the compiler may emit when generating machine code. It can only be influenced for certain via compiler flags (if we ignore inline asm). The
Vc_IMPL
macro did exclude the use of some intrinsics, but that never really stopped the compiler's optimizer from transforming the code to use the more advanced instructions anyway.
Take a look at datapar/macros.h
for the new macros.
@oshadura Oh, and please don't push to my branches. If you think something should be merged to my branch please open a pull request instead.
@mattkretz , sorry for this!!! Should I revert? I finally rebased onto mkretz/datapar and start to look on differences of implementation.
Leave it for now. Shouldn't be a problem for me to rebase onto this time.
@mattkretz, my belated wishes and Happy New Year! I am making a progress, but currently stuck with definition which packing we would like to have implemented for NEON? For example in SSE there are _m128, _m128i, _m128d, for NEON there are a lot of different "packings" (including half-wide) ~ 18?
@mattkretz, one more question about supportfunctions test, maybe we need to add separate header for ARM, like include/Vc/cpuid.h? In "Cortex-A Series Programmer's guide in Section 20.1.7 Detecting NEON" is used /proc/cpuinfo(text) and auxiliary vector /proc/self/auxv(binary) for reading and interpreting information about the CPU.
Thanks, a successful and happy new year to you as well!!
different "packings"
You can encode this packing in variations of the ABI tag type. E.g.
namespace datapar_abi {
struct neon {};
template <int N> struct partial_neon {};
}
In principle this can be done for SSE, too. Just that NEON can do it more efficiently because it has instruction support for it.
I recommend you ignore the partial_neon<N>
ABI for now, though. The partial ABI will likely not see many users.
about supportfunctions test, maybe we need to add separate header for ARM, like include/Vc/cpuid.h?
Ignore those for now. Focus on the core datapar implementation as specified by the latest C++ paper of mine.