branchless-utf8

Aliasing optimization

Open churchofthought opened this issue 8 years ago • 1 comments

https://en.m.wikipedia.org/wiki/Restrict

GCC's and Clang's restrict will let the compiler optimize the code further. Can you rerun your tests? It should give at least another 10% performance boost.
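For readers unfamiliar with the qualifier, here is a minimal sketch of what restrict promises. The two functions are hypothetical illustrations, not code from branchless-utf8. In the plain version the compiler has to allow for out and len overlapping; in the restrict version the caller promises they do not.

```c
/* Without restrict: out and len might point into the same storage,
 * so the compiler must assume a store to out[i] could change *len
 * and generally has to re-read *len on every iteration. */
void fill_plain(int *out, const int *len, int value)
{
    for (int i = 0; i < *len; i++)
        out[i] = value;
}

/* With restrict: the caller promises that out and len do not overlap,
 * so the compiler is free to read *len once and keep it in a register.
 * Calling this with overlapping pointers is undefined behavior. */
void fill_restrict(int *restrict out, const int *restrict len, int value)
{
    for (int i = 0; i < *len; i++)
        out[i] = value;
}
```

For non-overlapping arguments both functions produce the same result; only the generated code may differ, and whether it does depends on the compiler and flags.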

churchofthought, Oct 07 '17 21:10

This is a good idea, and I didn't think to try it myself. However, both GCC 6.3.0 and Clang 3.8.1 already figure out on their own that there is no aliasing and produce identical binaries regardless of restrict / __restrict__. My speculation is that when the function gets inlined, it can see the extent of all objects in play at once and prove there is no aliasing. That's the magic of header libraries and inlining!

Are you seeing different code generation when you add restrict? If so, what compiler and version are you using? You can literally just sha1sum your two binaries, with and without restrict, to see there is no difference.

The non-inlined version of utf8_decode() does change in the presence of restrict, and in other circumstances it might be worth using. Remove "static" from the definition and you'll see a difference. It just doesn't matter for this benchmark, at least with recent compilers.
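A rough sketch of how such a comparison could be set up. toy_decode() is a hypothetical stand-in, not the library's utf8_decode(), and the USE_RESTRICT and EXTERNAL macros are names made up for this example. Both the qualifier and the linkage are toggled from the command line, so the same file can be built four ways and the results compared.

```c
#include <stdint.h>

/* Hypothetical switches, set with -DUSE_RESTRICT and -DEXTERNAL. */
#ifdef USE_RESTRICT
#  define RESTRICT restrict
#else
#  define RESTRICT
#endif

#ifdef EXTERNAL
#  define LINKAGE           /* externally visible: a standalone copy is emitted */
#else
#  define LINKAGE static    /* internal: the compiler may inline every call */
#endif

/* Toy decoder: combines two bytes into *c and sets *e when the first
 * byte is zero. Without restrict the compiler has to allow for buf,
 * c and e overlapping, e.g. the store to *e changing *c. */
LINKAGE const unsigned char *
toy_decode(const unsigned char *RESTRICT buf, uint32_t *RESTRICT c,
           int *RESTRICT e)
{
    *c = (uint32_t)buf[0] << 8;
    *e = (buf[0] == 0);
    *c |= buf[1];
    return buf + 2;
}
```

Building a benchmark around this with and without -DUSE_RESTRICT, then running sha1sum on the two binaries, shows whether the qualifier changed anything for the static version; adding -DEXTERNAL repeats the experiment for the non-inlined case.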

skeeto, Oct 08 '17 02:10