STL icon indicating copy to clipboard operation
STL copied to clipboard

Could you explain why the _Hash class manages buckets using both lower and upper bounds?

Open m5623skhj opened this issue 7 months ago • 1 comments

First of all, I’d like to kindly ask for your understanding as I’m using a translation tool to write this message.

While looking into the _Hash class, I came across something I didn’t fully understand, so I’m reaching out with a question. From what I can tell, the _Hash class seems to manage buckets in the form of [2n, 2n + 1] I’m curious about the reasoning behind this design. Is there a specific optimization technique involved in this approach?

My question is, why are buckets configured in the form of 2 * n? And why is _Find_first() called if it is Multi, and _Find_last() called if it is not Multi in _Find()?

The reason I have this question is because I think that the bucket can sufficiently play its role even if it is not in the form of 2 * n but in the form of n.

Thank you very much to anyone who takes the time to read my question. :)

m5623skhj avatar May 12 '25 00:05 m5623skhj

why are buckets configured in the form of 2 * n?

I think you meant 2n. You can use 2<sup>n</sup> or $2^n$ to express it on GitHub.

I guess the intent was to allow using bitwise AND to calculate the remainder. E.g. https://github.com/microsoft/STL/blob/da7307d68dbf39a805372e91a740393cbf06595c/stl/inc/xhash#L858-L861

And why is _Find_first() called if it is Multi, and _Find_last() called if it is not Multi in _Find()?

When we use _Find_last, it's intentional to investigate the _Duplicate result of the result, which interacts with uniqueness of keys.

frederick-vs-ja avatar May 12 '25 02:05 frederick-vs-ja

We made this change many years ago, near the beginning of my career (I think it was in VS 2010 or 2012). As I recall, we changed from tracking buckets with a single iterator, to using "low" and "high" bounds, to improve the complexity of operations like erase(). Otherwise, we needed to iterate through all elements in the bucket in order to find the end of that bucket (and the beginning of the next one).

StephanTLavavej avatar May 14 '25 23:05 StephanTLavavej