timeout
timeout copied to clipboard
Use a faster safe bitwise rotate implementation.
This branch replaces the rotation operation with one that compiles down to a single branchless rotate instruction (on my gcc and clang compilers). The magic trick here is the fact that :
(v >> (c&mask)) | (v << (-c&mask))
works even when c==0, is conformant c, doesn't require a branch, and many compilers can recognize it and replace it with a rotate instruction.
I have tested this, but I haven't benchmarked it yet or tested with super-old or super-weird compilers. I would assume that eliminating these branch instructions is always a good idea, though, and it doesn't make the timeout.c code uglier.
Hm, we'd better be careful here. I just ran the benchmarks over master and all 3 of these branches, and to my surprise they don't show a very large improvement. (I wasn't very careful in my setup, though, so maybe better benchmark run would show results better.)
Yikes. That's alot of code for something simple. I'd rather hold off on this. I was thinking it was a small diff that reduced the overall SLoC.
Regarding branching, I recently stumbled upon the very enlightening paper, Branch Prediction and the Performance of Interpreters - Don't Trust Folklore. The takeaway is that predictors have been getting much better across the board. Haswell is so good at it that often times it's wasted effort to even think about branch optimization.
Regardless of performance, I do prefer removing conditionals as removing them them tends to make code easier to follow and manage overall, even if the details of why a condition is unneeded isn't obvious. But this patch adds more (preprocessor) conditionals and other code.
Is it okay if I cherry pick (e.g. moving rotr and rotl to timeout-bitops) so I can merge the Makefile and regression test patches?
Of course it's okay to cherry-pick. :)