We should add a fast approximation of atan2f that vectorizes cleanly.
This article seems like an amazing reference:
https://mazzo.li/posts/vectorized-atan2.html
You may assign me; I think I'll do it. I think the bad performance I'm seeing comes from 8 separate calls to glibc's atan2f, instead of something that vectorizes cleanly.
Or this one, indeed: https://github.com/boulos/syrah/blob/4ac08d54daa09fc4e7ac8424898d21deda18e103/src/include/syrah/FixedVectorMath.h#L288-L348
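For context, here is a rough sketch of the kind of branch-free approximation those references describe: reduce the argument to [0, 1] with min/max, evaluate an odd polynomial, then undo the reduction with selects. Everything here is an assumption for illustration, not a proposed final implementation: the function names (`atan_poly`, `atan2_approx`, `atan2_approx_batch`) are hypothetical, and the degree-11 coefficients are one commonly quoted set that should be re-derived or checked before use. It ignores NaN, infinities and the sign of zero, which a real version would need to decide on.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Odd polynomial approximating atan(a) for a in [0, 1].
// The coefficients are an assumption (a commonly quoted degree-11 fit);
// verify the error bound before relying on them.
inline float atan_poly(float a) {
    const float a1 = 0.99997726f;
    const float a3 = -0.33262347f;
    const float a5 = 0.19354346f;
    const float a7 = -0.11643287f;
    const float a9 = 0.05265332f;
    const float a11 = -0.01172120f;
    const float z = a * a;
    // Horner evaluation in z = a^2, multiplied by a at the end.
    return a * (a1 + z * (a3 + z * (a5 + z * (a7 + z * (a9 + z * a11)))));
}

// Approximate atan2(y, x) without calling libm's atan2f.
// Simplifications: no NaN/infinity handling, signed zeros are ignored,
// and (0, 0) returns 0.
inline float atan2_approx(float y, float x) {
    const float pi = 3.14159265358979f;
    const float half_pi = 1.57079632679490f;
    const float ax = std::fabs(x);
    const float ay = std::fabs(y);
    const float mx = std::max(ax, ay);
    const float mn = std::min(ax, ay);
    // Ratio in [0, 1]; guard the 0/0 case.
    const float a = (mx == 0.0f) ? 0.0f : mn / mx;
    float r = atan_poly(a);
    // Each step below is a select rather than control flow.
    r = (ay > ax) ? half_pi - r : r;  // undo the swap of x and y
    r = (x < 0.0f) ? pi - r : r;      // left half-plane
    r = (y < 0.0f) ? -r : r;          // lower half-plane
    return r;
}

// Hypothetical batch wrapper: a plain loop over independent lanes,
// which a compiler may be able to vectorize.
inline void atan2_approx_batch(const float *y, const float *x,
                               float *out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = atan2_approx(y[i], x[i]);
    }
}
```

The point of the structure is that every lane runs the same instruction sequence, so 8 inputs can be processed as one vector instead of 8 scalar library calls. Whether that actually pays off depends on the target and on the accuracy we are willing to accept.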
Tagging zvookin because he's looked into doing this for some other similar cases (e.g. tanh).
Tagging @mcourteaux because nothing has happened; assign this to me instead if you don't have bandwidth.
Assigning me is great! I just wanted to turn this idea into an issue and have it assigned to me. It's still on my backlog. I will definitely get to this, but it's low priority right now. Sometime in the coming month or two, ideally.
PS: I cannot assign anyone. I don't have those permissions, it seems. 😢