STL
STL copied to clipboard
<deque>: Needs a major performance overhaul
deque has long-standing performance problems, including but not limited to its choice of a very small block size. Unlike all other containers, deque::iterator always uses the proxy object, further adding to its complexity. When we can break binary compatibility in the vNext release, we need to eliminate deque's proxy (as we've already done for all other containers in debug mode, except vector<bool>), retune the block size, and generally rethink deque from scratch.
Also tracked by Microsoft-internal VSO-102760 / AB#102760.
vNext note: Resolving this issue will require breaking binary compatibility. We won't be able to accept pull requests for this issue until the vNext branch is available. See #169 for more information.
The current block size is tragically small: For more than two int32 values or for a single int64 the deque degenerates to a vector of pointers. Thus for all practical purposes, on MSVC deque IS a vector of pointers, which defeats its purpose of reducing allocator load and memory fragmentation.
https://github.com/microsoft/STL/blob/main/stl/inc/deque#L561
while GCC 4.6.3 allocates blocks of 512 bytes:
#define _GLIBCXX_DEQUE_BUF_SIZE 512
inline size_t
__deque_buf_size(size_t __size)
{ return (__size < _GLIBCXX_DEQUE_BUF_SIZE
? size_t(_GLIBCXX_DEQUE_BUF_SIZE / __size) : size_t(1)); }
https://gcc.gnu.org/onlinedocs/gcc-4.6.3/libstdc++/api/a01049_source.html
boost::deque uses a similar logic and block size as the glibc implementation above.
Currently, both checked and unchecked iterators of MSVC STL's deque add one more layer of indirection. It seems that we can add simpler iterator types that hold the _Mapptr and the offset before vNext and use them internally. I don't know whether we can directly change unchecked iterators now.
The tiny block size of MSVC's implementation is just utterly baffling. Whoever made that choice actively contributed to make the world that much harder for devs, and should be ashamed. Please fix this - in my limited knowledge I don't see how this could possibly be an ABI breaking change, but you had plenty choice to fix this even if it was, and you didn't. For now, I gotta keep using my own implementation, or just accept that windows users get a less optimised experience.