
avx: AVX1 support for matrix inverse

Open recp opened this issue 7 years ago • 2 comments

cglm already has an AVX version of mat4_mul, but mat4_inv had none, so I implemented an AVX1 version of the matrix inverse.

After I upgrade my MacBook Pro I'll try to implement AVX2 + FMA versions too. My current CPU does not support them, so I can't do that for now.

I tested mat4_inv on an Ivy Bridge CPU and got performance similar to the SSE version (not better), but the result may differ on newer CPUs. I'll try to remove some shuffles later to improve performance.
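When comparing an SSE path against an AVX path, a plain scalar inverse is useful as a reference for checking that both still return the same matrix. The sketch below is not cglm's implementation. It uses Gauss-Jordan elimination with partial pivoting, and the names `ref_mat4_inv` and `ref_abs` as well as the `1e-12` singularity threshold are assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical helper: absolute value without pulling in <math.h>. */
static double ref_abs(double x) { return x < 0.0 ? -x : x; }

/*
 * Hypothetical scalar reference inverse (not part of cglm).
 * Works on the 4x4 array as indexed; because inverse(transpose(M)) equals
 * transpose(inverse(M)), the result is valid for row-major and
 * column-major storage alike.
 * Returns false when the matrix looks singular (pivot below 1e-12,
 * an arbitrary threshold chosen for this sketch).
 */
bool ref_mat4_inv(float m[4][4], float dest[4][4]) {
  double a[4][8];

  /* Build the augmented matrix [M | I] in double precision. */
  for (int r = 0; r < 4; r++) {
    for (int c = 0; c < 4; c++) {
      a[r][c]     = m[r][c];
      a[r][c + 4] = (r == c) ? 1.0 : 0.0;
    }
  }

  for (int col = 0; col < 4; col++) {
    /* Partial pivoting: pick the row with the largest entry in this column. */
    int pivot = col;
    for (int r = col + 1; r < 4; r++) {
      if (ref_abs(a[r][col]) > ref_abs(a[pivot][col]))
        pivot = r;
    }
    if (ref_abs(a[pivot][col]) < 1e-12)
      return false;

    if (pivot != col) {
      for (int c = 0; c < 8; c++) {
        double tmp  = a[col][c];
        a[col][c]   = a[pivot][c];
        a[pivot][c] = tmp;
      }
    }

    /* Normalize the pivot row, then clear this column in every other row. */
    double inv = 1.0 / a[col][col];
    for (int c = 0; c < 8; c++)
      a[col][c] *= inv;

    for (int r = 0; r < 4; r++) {
      if (r == col)
        continue;
      double f = a[r][col];
      for (int c = 0; c < 8; c++)
        a[r][c] -= f * a[col][c];
    }
  }

  /* The right half of the augmented matrix now holds the inverse. */
  for (int r = 0; r < 4; r++)
    for (int c = 0; c < 4; c++)
      dest[r][c] = (float)a[r][c + 4];

  return true;
}
```

A SIMD result can then be compared element by element against `dest` with a small tolerance, since the shuffle-heavy paths will not round identically to this double-precision reference.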

New functions:

  • [x] glm_mat4_scale_avx(mat4 m, float s)
  • [x] glm_mat4_inv_avx(mat4 mat, mat4 dest)

These are selected automatically if -mavx is set.
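As a rough illustration of what "selected automatically" can look like: GCC and Clang define the `__AVX__` macro when `-mavx` is passed, so a wrapper can pick the AVX path at compile time and fall back to scalar code otherwise. This is a minimal sketch under that assumption, not the code in this branch. The `demo_*` names and the `demo_mat4` typedef are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__AVX__)
#  include <immintrin.h>
#endif

/* Hypothetical stand-in for a 4x4 float matrix (16 contiguous floats). */
typedef float demo_mat4[4][4];

/* Portable fallback: multiply all 16 elements by s. */
void demo_mat4_scale_scalar(demo_mat4 m, float s) {
  for (int c = 0; c < 4; c++)
    for (int r = 0; r < 4; r++)
      m[c][r] *= s;
}

#if defined(__AVX__)
/* AVX path: the 16 floats fit in two 256-bit registers.
 * Unaligned loads/stores are used so the sketch makes no
 * assumption about the matrix alignment. */
void demo_mat4_scale_avx(demo_mat4 m, float s) {
  float *p = &m[0][0];
  __m256 k = _mm256_set1_ps(s);

  _mm256_storeu_ps(p,     _mm256_mul_ps(_mm256_loadu_ps(p),     k));
  _mm256_storeu_ps(p + 8, _mm256_mul_ps(_mm256_loadu_ps(p + 8), k));
}
#endif

/* Public entry point: the backend is chosen by the preprocessor,
 * so there is no runtime CPU detection in this sketch. */
void demo_mat4_scale(demo_mat4 m, float s) {
  assert(m != NULL);
#if defined(__AVX__)
  demo_mat4_scale_avx(m, s);
#else
  demo_mat4_scale_scalar(m, s);
#endif
}
```

Because the choice is made at compile time, a binary built with `-mavx` will fault on a CPU without AVX. Callers who need one binary for both cases would have to add runtime detection on top of this.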

I'll try to optimize the SIMD functions with SSE3 and SSE4 later.

recp avatar Oct 30 '18 07:10 recp

Coverage Status

Coverage remained the same at 11.487% when pulling 01b93b0409c637aae8f6bbcea7b0440ac5108d84 on simd into 07e60bd0981840ec183d6e2ed23415ba63168107 on master.

coveralls avatar Oct 30 '18 07:10 coveralls

glm_mat4_scale_avx() has been added to master, and I'll try to re-implement the AVX version.

recp avatar Apr 30 '21 22:04 recp