
WIP: More Optimizations and SIMD fixes for MSVC & ARM

Open recp opened this issue 1 year ago • 0 comments

  • [WIP] More SIMD optimizations
    • Matrix invert
    • Non-Square matrices
    • Transforms
    • AABB
    • Frustum
    • SIMD for int types
    • ...
  • [x] Fix compiling on MSVC + ARM32 (don't align types on MSVC + ARM32 due to "719: formal parameter with requested alignment of 16 won't be aligned")
  • [x] msvc, simd: fix simd headers for _M_ARM64EC
  • [x] arm, neon: fix neon support on GCC ARM
  • [ ] Try to interleave independent instructions to take advantage of ILP where possible (compilers may do this already, but giving the hint manually is nice)
  • [ ] Try to reduce port pressure where possible, e.g. use _mm_blend_ps instead of many _mm_shuffle_ps calls (this step may take time and needs to be profiled, e.g. with Intel VTune to find the bottleneck, plus speed tests). Maybe in other PRs...
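
To illustrate the ILP item above, here is a minimal sketch in portable scalar C, not code from cglm. The function names `sum_serial` and `sum_ilp4` are hypothetical. The first version chains every add through one accumulator; the second keeps four independent accumulators, so consecutive adds do not depend on each other and the CPU can overlap them. The same idea applies to SIMD registers.

```c
#include <stddef.h>

/* Baseline: a single accumulator, so each add waits for the previous one. */
float sum_serial(const float *v, size_t n) {
  float s = 0.0f;
  for (size_t i = 0; i < n; i++)
    s += v[i];
  return s;
}

/* Hypothetical ILP variant: four independent dependency chains.
   The adds within one iteration have no data dependency on each other. */
float sum_ilp4(const float *v, size_t n) {
  float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
  size_t i = 0;

  for (; i + 4 <= n; i += 4) {
    s0 += v[i];
    s1 += v[i + 1];
    s2 += v[i + 2];
    s3 += v[i + 3];
  }

  /* Remaining 0..3 elements go into the first accumulator. */
  for (; i < n; i++)
    s0 += v[i];

  /* Combine the partial sums at the end. */
  return (s0 + s1) + (s2 + s3);
}
```

Splitting the sum changes the order of floating-point additions, so results can differ slightly in rounding from the serial version. That is one reason compilers usually do not apply this transformation on their own without relaxed floating-point flags, and why any gain should be confirmed by profiling.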

recp, Apr 06 '24 22:04