Faster Mask::firstOne()
This change removes the undefined behaviour of the firstOne method when it is called on an empty mask.
I did this for two reasons:
- Undefined behaviour is not a nice thing to have. And, more importantly,
- It is faster than checking beforehand whether the mask is empty.
This change gave a 5% speed improvement on average in my tests [1]. The test code is a generic lower_bound that uses SIMD instructions through Vc.
I chose to return Mask::Size because it changes the function's return range only slightly: instead of returning a value in [0, Size), it returns a value in [0, Size]. One alternative is for firstOne to receive the default value to return for empty masks; I think the speedup should be the same.
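
To illustrate the proposed return pattern, here is a minimal scalar sketch. `ToyMask`, its 8-lane width, and `firstMatchOrEnd` are hypothetical stand-ins invented for this example. They are not Vc's Mask implementation, and the loop only models the semantics, not the SIMD code path.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical scalar stand-in for a SIMD mask (assumption: 8 lanes).
struct ToyMask {
    static constexpr std::size_t Size = 8;
    std::uint8_t bits;  // bit i set means lane i is true

    // Index of the first set lane, or Size when no lane is set,
    // so the result is always in [0, Size].
    std::size_t firstOne() const {
        for (std::size_t i = 0; i < Size; ++i) {
            if ((bits >> i) & 1u) return i;
        }
        return Size;
    }

    // The alternative mentioned above: the caller supplies the value
    // to return for an empty mask.
    std::size_t firstOne(std::size_t ifEmpty) const {
        return bits == 0 ? ifEmpty : firstOne();
    }
};

// A lower_bound-style caller: no isEmpty() check is needed, because an
// empty mask maps to "one past the end of this block".
std::size_t firstMatchOrEnd(ToyMask m, std::size_t blockStart) {
    return blockStart + m.firstOne();
}
```

With this convention a search loop can add the result to the block offset unconditionally; an empty mask then simply points one past the end of the current block.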
[1] https://github.com/andrelrt/VcAlgo/blob/CodeMigration/include/VcAlgo/details/lower_bound.h
@andrelrt Can you please rebase this on the current branch, so that it can go through the CI? I'd like to get through all the current merge requests and merge what makes sense before we make a new release. Thank you!