EJ Park

Results 3 issues of EJ Park

We needed to set XNN_ALLOCATION_ALIGNMENT to 128 bytes for HVX and use a predicated store for the tail part. CTest passes, but performance is not good yet. The next step naturally...
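To illustrate the two points above, here is a minimal scalar sketch in portable C. It is not the XNNPACK or HVX code: `alloc_aligned_128` and `store_tail_model` are hypothetical helper names, and the masked loop only models what a predicated vector store does (write the first `tail_lanes` lanes, leave the rest of memory untouched). It assumes the 128-byte HVX vector mode, that is 32 float lanes per vector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define HVX_VECTOR_BYTES 128                              /* assumed 128-byte HVX mode */
#define HVX_F32_LANES (HVX_VECTOR_BYTES / sizeof(float))  /* 32 float lanes */

/* Hypothetical helper: returns a buffer whose address is a multiple of
   128 bytes. C11 aligned_alloc wants the size to be a multiple of the
   alignment, so the request is rounded up. Release it with free(). */
static void* alloc_aligned_128(size_t bytes) {
  size_t rounded =
      (bytes + HVX_VECTOR_BYTES - 1) / HVX_VECTOR_BYTES * HVX_VECTOR_BYTES;
  if (rounded == 0) {
    rounded = HVX_VECTOR_BYTES;
  }
  return aligned_alloc(HVX_VECTOR_BYTES, rounded);
}

/* Hypothetical scalar model of a predicated store: only lanes
   [0, tail_lanes) are written, so nothing past the tail is clobbered. */
static void store_tail_model(float* dst, const float* vec, size_t tail_lanes) {
  assert(tail_lanes <= HVX_F32_LANES);
  for (size_t lane = 0; lane < HVX_F32_LANES; lane++) {
    if (lane < tail_lanes) {  /* lane predicate */
      dst[lane] = vec[lane];
    }
  }
}
```

With `n` output elements, the full vectors would be stored normally and the last `n % 32` elements would go through the predicated path, so the kernel never writes beyond the end of the output buffer.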

I hit two new errors, so I updated the build recipe for Hexagon. 1. I found that XNNPACK_ENABLE_RISCV_VECTOR=ON caused a compilation error, so I disabled this CMake variable. If you find any issue...

- Initial implementation and test added.
- xnnpack/intrinsics-polyfill.h has the horizontal sum code (Q6_f32_vrsum_Vsf) using vshuff and vadd.
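
The shuffle-and-add idea behind a horizontal sum can be sketched in plain C. This is a scalar model only, not the contents of xnnpack/intrinsics-polyfill.h: `hsum_f32_model` and `HSUM_LANES` are hypothetical names, 32 lanes is an assumption based on 128-byte vectors of floats, and the exact lane pairing used by the real vshuff sequence may differ. Each step adds the upper half of the active lanes onto the lower half, so 32 lanes collapse to one total in 5 steps.

```c
#include <assert.h>
#include <stddef.h>

#define HSUM_LANES 32  /* assumed: 128-byte vector of 32-bit floats */

/* Hypothetical scalar model of a log-step horizontal sum.
   Each pass stands in for one "shuffle, then vector add" pair. */
static float hsum_f32_model(const float* v) {
  assert(v != NULL);
  float acc[HSUM_LANES];
  for (size_t i = 0; i < HSUM_LANES; i++) {
    acc[i] = v[i];
  }
  /* width takes the values 16, 8, 4, 2, 1 */
  for (size_t width = HSUM_LANES / 2; width >= 1; width /= 2) {
    for (size_t i = 0; i < width; i++) {
      acc[i] += acc[i + width];  /* fold upper half onto lower half */
    }
  }
  return acc[0];  /* lane 0 now holds the sum of all lanes */
}
```

Note that this pairwise order of additions differs from a left-to-right scalar loop, so for general float inputs the result can differ in the last bits from a naive reference sum.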