STL
STL copied to clipboard
Should `ranges::minmax` be auto-vectorized instead of manual vectorization.
I now see that manually-vectorizing ranges::minmax in #4384 was not necessarily a good idea.
Original implementation was implemented in terms of minmax_element and was not vectorized.
But if it was implemented directly, it would have been vectorized.
Consider the following:
unsigned int* max_element(unsigned int*b, unsigned int* e){
unsigned int m = *b;
unsigned int* r = b;
for (++b; b != e; ++b){
if (m < *b) {
m = *b;
r = b;
}
}
return b;
}
unsigned int max1(unsigned int*b, unsigned int* e){
return *max_element(b, e);
}
unsigned int max2(unsigned int *b, unsigned int* e){
unsigned int m = *b;
for (++b; b != e; ++b){
if (m < *b) {
m = *b;
}
}
return m;
}
max1 is not vectorized, but max2 is. see https://godbolt.org/z/fG6Yzfz4v
So we could just reimplement ranges::max_element in headers and enjoy vectorization with more context.
Question how should we proceed:
- Keep manual vectorization forever
- Drop it in vNext
- Drop it any time
We talked about this at the weekly maintainer meeting and we're in favor of implementing the ranges::minmax value-based family with dedicated code in the core header <__msvc_minmax.hpp> (currently only providing types), having <algorithm> use that, and also having vector_algorithms.cpp use that to implement _Minmax (because the import lib can never shrink until vNext).
We should check the benchmark to make sure that this doesn't regress performance significantly.
From #4734 there's a conclusion that floating point types without /fp:fast option should be vectorized with _element approach. This has to be kept.
The compiler doesn't generate the _element apporach, doesn't try to vectorize floating min/max at all, and currently even fails to apply the scalar approach properly, see DevCom-10686775.
As mentioned in #5010, it looks like keeping manual vectorization is best for now.