Understanding pre-fetch logic

Open alonre24 opened this issue 2 years ago • 3 comments

Hi, I have a question regarding the use of _mm_prefetch calls when searching the graph. In searchBaseLayerST, which is called from searchKnn, some data is prefetched from addresses that will be accessed in the upcoming iteration. One of these calls is: _mm_prefetch((char *) (visited_array + *(data + 1)), _MM_HINT_T0); which fetches the visited_array entry of the first candidate to scan. However, the purpose of the following prefetch call, _mm_prefetch((char *) (visited_array + *(data + 1) + 64), _MM_HINT_T0); which seems to target the next cache line, is not clear to me. What is it for?
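To make the pattern in question concrete, here is a minimal sketch of a neighbor scan that prefetches the visited tag of the next neighbor while handling the current one. It is not hnswlib's actual code: visited_t, scan_neighbors, PREFETCH_T0 and kSecondPrefetchBytes are hypothetical names, and the 2-byte tag type is an assumption about the visited array that is worth checking against the source. Under that assumption, the "+ 64" in the second call is pointer arithmetic in elements, so it would land 128 bytes ahead rather than on the adjacent 64-byte cache line.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Portable prefetch wrapper (hypothetical helper, not part of hnswlib).
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#define PREFETCH_T0(p) _mm_prefetch(reinterpret_cast<const char*>(p), _MM_HINT_T0)
#elif defined(__GNUC__)
#define PREFETCH_T0(p) __builtin_prefetch((p), 0, 3)
#else
#define PREFETCH_T0(p) ((void)0)
#endif

// ASSUMPTION: visited tags are 2-byte values.
using visited_t = unsigned short;

// Adding 64 to a visited_t* advances 64 elements, not 64 bytes.
constexpr std::size_t kSecondPrefetchBytes = 64 * sizeof(visited_t);

// Marks every not-yet-visited neighbor with `tag` and returns how many were new.
// The tag of neighbor j+1 is prefetched while neighbor j is being processed.
std::size_t scan_neighbors(const std::vector<unsigned>& neighbors,
                           std::vector<visited_t>& visited,
                           visited_t tag) {
    std::size_t fresh = 0;
    if (!neighbors.empty()) {
        PREFETCH_T0(visited.data() + neighbors[0]);
    }
    for (std::size_t j = 0; j < neighbors.size(); ++j) {
        if (j + 1 < neighbors.size()) {
            PREFETCH_T0(visited.data() + neighbors[j + 1]);
        }
        const unsigned id = neighbors[j];
        assert(id < visited.size());
        if (visited[id] == tag) {
            continue;  // already seen during this search
        }
        visited[id] = tag;
        ++fresh;
    }
    return fresh;
}
```

Calling scan_neighbors twice with the same tag on the same list returns zero the second time, since every entry is already marked. The prefetch only changes timing, never the result, which is why its value can only be judged by measurement.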

Also, in searchBaseLayer, which is called from addPoint, the prefetch calls are a bit different from those in searchBaseLayerST. One of the differences is a call to: _mm_prefetch(getDataByInternalId(candidateSet.top().second), _MM_HINT_T0); which fetches the vector data of the current "best candidate". But why do we need its data if its distance has already been computed?

I would appreciate the help, so I can understand if there is something that I'm missing here... Thanks!

alonre24, Jul 31 '22 16:07

Hi @alonre24 Thanks for the observation!

As far as I remember, the prefetches during querying (searchBaseLayerST) were tuned by measuring query latency, and some of them might have only a homeopathic effect (~1% difference). I guess fetching the next cache line should be handled by the hardware prefetcher, so the software prefetch there is indeed useless. The prefetches during construction (searchBaseLayer) were copied from the query code at an earlier stage, and that is probably the reason why there are strange and useless prefetches and a lack of more useful ones, with some performance loss.

I wonder if you've measured any improvement from removing them?
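A rough way to answer that kind of question outside the library is an A/B timing of the same access pattern with and without the prefetch. The sketch below is only an illustration under stated assumptions: gather_sum, time_ms, run_bench and the Bench struct are hypothetical names, the random gather over a flat float array merely stands in for visiting vectors by id, and the timings depend heavily on hardware, data size and compiler flags.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Portable prefetch wrapper (hypothetical helper, not part of hnswlib).
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#define PREFETCH_T0(p) _mm_prefetch(reinterpret_cast<const char*>(p), _MM_HINT_T0)
#elif defined(__GNUC__)
#define PREFETCH_T0(p) __builtin_prefetch((p), 0, 3)
#else
#define PREFETCH_T0(p) ((void)0)
#endif

// Sums the components of the vectors selected by `ids`, optionally
// prefetching the start of the next vector one iteration ahead.
template <bool UsePrefetch>
double gather_sum(const std::vector<float>& data,
                  const std::vector<std::uint32_t>& ids,
                  std::size_t dim) {
    double sum = 0.0;
    for (std::size_t i = 0; i < ids.size(); ++i) {
        if (UsePrefetch && i + 1 < ids.size()) {
            PREFETCH_T0(data.data() + static_cast<std::size_t>(ids[i + 1]) * dim);
        }
        const float* v = data.data() + static_cast<std::size_t>(ids[i]) * dim;
        for (std::size_t d = 0; d < dim; ++d) {
            sum += v[d];
        }
    }
    return sum;
}

// Wall-clock time of a callable, in milliseconds.
template <typename F>
double time_ms(F&& f) {
    const auto t0 = std::chrono::steady_clock::now();
    f();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

struct Bench {
    double ms_plain = 0.0, ms_prefetch = 0.0;
    double sum_plain = 0.0, sum_prefetch = 0.0;
};

// Builds n random vectors of dimension dim and times both variants
// over the same random sequence of ids.
Bench run_bench(std::size_t n, std::size_t dim, std::size_t lookups, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> val(0.0f, 1.0f);
    std::vector<float> data(n * dim);
    for (auto& x : data) x = val(rng);

    std::uniform_int_distribution<std::uint32_t> pick(0, static_cast<std::uint32_t>(n - 1));
    std::vector<std::uint32_t> ids(lookups);
    for (auto& id : ids) id = pick(rng);

    Bench b;
    b.ms_plain = time_ms([&] { b.sum_plain = gather_sum<false>(data, ids, dim); });
    b.ms_prefetch = time_ms([&] { b.sum_prefetch = gather_sum<true>(data, ids, dim); });
    return b;
}
```

For the comparison to mean anything, the data set should be much larger than the last-level cache, each variant should be run several times, and the two sums should be compared to confirm that both variants did the same work. A difference of around 1% is easily lost in run-to-run noise.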

yurymalkov, Aug 03 '22 05:08

Thanks for the response, @yurymalkov! In my benchmarks I noticed a minor improvement from adjusting the prefetch calls in searchBaseLayer, but nothing significant. Just wanted to make sure :)

alonre24, Aug 04 '22 14:08

Got it, makes sense! I guess hnswlib should be updated as well, though testing it properly is a bit of a hassle.

yurymalkov, Aug 05 '22 23:08