How to Calculate Search Scores and Rankings for Hybrid Searches

Open nohanaga opened this issue 2 years ago • 0 comments

Hi team, I am wondering how search scores and rankings are calculated for hybrid searches. I ran a keyword search (top=10) and a vector search (vector.k=10) and got the rankings below. I calculated the scores as RRFscore(d) = Σ_i 1 / (k + rank_i(d)), where rank_i(d) is the rank of document d in result list i, with the constant k = 59.

1. Simple Vector Search Result

| FileName | Vector-Rank | Vector-RRF |
| --- | --- | --- |
| A.txt | 2 | 1/(k+2)=0.016393443 |
| B.txt | 1 | 1/(k+1)=0.016666667 |
| C.txt | 3 | 1/(k+3)=0.016129032 |

2. Simple Keyword Search Result

| FileName | Keyword-Rank | Keyword-RRF |
| --- | --- | --- |
| A.txt | 8 | 1/(k+8)=0.014925373 |
| B.txt | 10 | 1/(k+10)=0.014492754 |
| C.txt | None | 1/(k+?)=None |

3. Σ RRF

| FileName | calculated-RRF | displayed-RRF |
| --- | --- | --- |
| A.txt | 1/(k+2)+1/(k+8)=0.031318816 | ≈0.03131881356239319 |
| B.txt | 1/(k+1)+1/(k+10)=0.031159420 | ≈0.031159421429038048 |
| C.txt | 1/(k+3)=0.016129032 | ≠0.029286926612257957 |
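For anyone who wants to reproduce the arithmetic, here is a minimal Python sketch of the calculation in the tables above. It assumes 1-based ranks and k = 59, as in my calculation. The function name `rrf_score` is only an illustrative helper and not part of any Azure SDK.

```python
K = 59  # constant assumed in this issue (the docs mention 60)

def rrf_score(ranks, k=K):
    """Sum 1 / (k + rank) over the lists a document appears in (1-based ranks)."""
    return sum(1.0 / (k + r) for r in ranks if r is not None)

# (vector rank, keyword rank) copied from the tables above; None = not in the top 10.
ranks = {"A.txt": (2, 8), "B.txt": (1, 10), "C.txt": (3, None)}
displayed = {
    "A.txt": 0.03131881356239319,
    "B.txt": 0.031159421429038048,
    "C.txt": 0.029286926612257957,
}

for name, doc_ranks in ranks.items():
    calculated = rrf_score(doc_ranks)
    gap = displayed[name] - calculated
    print(f"{name}: calculated={calculated:.9f} displayed={displayed[name]:.9f} gap={gap:+.9f}")
```

A.txt and B.txt agree with the displayed score to about eight decimal places, while C.txt shows a gap far too large to be rounding.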

Question 1. The docs describe k as something like 60, but my calculation only matches with k = 59. Which value is actually used?
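One possible explanation, which is only my assumption and not something I have confirmed, is that the service uses k = 60 but counts the top hit as position 0. The two conventions give identical terms, as this small Python check shows (both function names are hypothetical helpers for illustration):

```python
def term_one_based(rank, k=59):
    """RRF term with k = 59 and the top hit counted as rank 1."""
    return 1.0 / (k + rank)

def term_zero_based(rank, k=60):
    """RRF term with k = 60 and the top hit counted as position 0 (assumption)."""
    return 1.0 / (k + (rank - 1))

# Compare the two conventions for every rank in a top-10 result list.
identical = all(term_one_based(r) == term_zero_based(r) for r in range(1, 11))
print("conventions give identical terms:", identical)
```

If that is the case, the scores alone cannot tell the two conventions apart, so it would help to know which one the service uses.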

Question 2. "C.txt" was not found in the Simple Keyword Search (top=10) results. How is the RRF calculated in this case? My guess is that the keyword search internally retrieves more results than the top value I specified, and that the rank from that larger list is used.
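To check whether this guess is at least consistent with the numbers, the displayed score for C.txt can be solved for the keyword rank it implies. This is a rough Python sketch that again assumes k = 59 and 1-based ranks. It only shows what the arithmetic implies and does not confirm how the service behaves.

```python
K = 59  # same assumed constant as in my calculation above

displayed_c = 0.029286926612257957  # displayed RRF score for C.txt
vector_term = 1.0 / (K + 3)         # C.txt was rank 3 in the vector search

# Whatever remains must come from the keyword side: residual = 1 / (K + keyword_rank)
residual = displayed_c - vector_term
implied_keyword_rank = 1.0 / residual - K

print(f"residual keyword term: {residual:.9f}")
print(f"implied keyword rank:  {implied_keyword_rank:.3f}")
```

The implied rank comes out very close to 17, which would fit a keyword list longer than the 10 results I requested.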

nohanaga, Jun 21 '23 16:06