azure-search-vector-samples
How to Calculate Search Scores and Rankings for Hybrid Searches
Hi team, I am wondering how search scores and rankings are calculated for hybrid searches. I ran a search with keyword search (top=10) and vector search (vector.k=10) and got the following rankings. To reproduce the scores I used RRFscore(d) = Σ_i 1 / (k + rank_i(d)), where rank_i(d) is the rank of document d in result list i, with the constant k = 59.
1. Simple Vector Search Result
| FileName | Vector-Rank | Vector-RRF |
|---|---|---|
| A.txt | 2 | 1/(k+2)=0.016393443 |
| B.txt | 1 | 1/(k+1)=0.016666667 |
| C.txt | 3 | 1/(k+3)=0.016129032 |
2. Simple Keyword Search Result
| FileName | Keyword-Rank | Keyword-RRF |
|---|---|---|
| A.txt | 8 | 1/(k+8)=0.014925373 |
| B.txt | 10 | 1/(k+10)=0.014492754 |
| C.txt | None | 1/(k+?)=None |
3. Σ RRF
| FileName | calculated-RRF | displayed-RRF |
|---|---|---|
| A.txt | 1/(k+2)+1/(k+8)=0.031318816 | ≈0.03131881356239319 |
| B.txt | 1/(k+1)+1/(k+10)=0.031159420 | ≈0.031159421429038048 |
| C.txt | 1/(k+3)=0.016129032 | ≠0.029286926612257957 |
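For anyone who wants to reproduce the table, here is a minimal sketch of the calculation I did by hand. It assumes k = 59 and 1-based ranks, which is only what matches my numbers and not something I have confirmed about the service. The names `rrf`, `vector_rank` and `keyword_rank` are mine, for illustration only.

```python
K = 59  # assumption: the constant that reproduces the displayed scores in my test

# 1-based ranks copied from the tables above; None means "not in the top 10"
vector_rank = {"A.txt": 2, "B.txt": 1, "C.txt": 3}
keyword_rank = {"A.txt": 8, "B.txt": 10, "C.txt": None}

# Scores shown in the hybrid search response
displayed = {
    "A.txt": 0.03131881356239319,
    "B.txt": 0.031159421429038048,
    "C.txt": 0.029286926612257957,
}


def rrf(ranks, k=K):
    """Sum 1 / (k + rank) over the result lists that contain the document."""
    return sum(1.0 / (k + r) for r in ranks if r is not None)


calculated = {
    name: rrf([vector_rank[name], keyword_rank[name]]) for name in vector_rank
}

for name in calculated:
    diff = displayed[name] - calculated[name]
    print(f"{name}: calculated={calculated[name]:.9f} "
          f"displayed={displayed[name]:.9f} diff={diff:+.9f}")
```

A.txt and B.txt agree with the displayed values to within float rounding, while C.txt is off by roughly 0.013, which is what prompted the questions below.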
Question 1. The docs describe k as a value like 60, but my calculations only match the displayed scores with k = 59. Which value is actually used?
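One possible reading, which I have not confirmed, is that the ranks are counted from 0 instead of 1. Arithmetically, k = 60 with 0-based ranks gives exactly the same terms as k = 59 with 1-based ranks, so my tables cannot tell the two apart. A small sketch of that check (the variable names are hypothetical):

```python
# 1-based ranks that appear in the tables above
ranks_one_based = [1, 2, 3, 8, 10]

pairs = []
for r in ranks_one_based:
    with_59 = 1.0 / (59 + r)        # k = 59, top hit counted as rank 1
    with_60 = 1.0 / (60 + (r - 1))  # k = 60, top hit counted as rank 0 (assumption)
    pairs.append((with_59, with_60))
    print(f"rank {r}: k=59 -> {with_59:.9f}, k=60 (0-based) -> {with_60:.9f}")
```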
Question 2. "C.txt" was not found in the Simple Keyword Search (top=10) results, yet its displayed score is higher than its vector-only RRF. How is the RRF calculated in this case? My guess is that the service internally uses a larger value than the top I specified in my query and ranks C.txt within that larger result list.
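To test that guess, the displayed score for C.txt can be solved backwards for the keyword rank it would imply. This is a rough sketch that again assumes k = 59 and 1-based ranks, and assumes the score is the sum of exactly one vector term and one keyword term. It shows what the numbers are consistent with, not how the service actually works.

```python
K = 59  # assumption carried over from my tables

displayed_c = 0.029286926612257957  # displayed hybrid score for C.txt
vector_term = 1.0 / (K + 3)         # C.txt is rank 3 in the vector results

# Whatever is left after removing the vector term would be the keyword term
residual = displayed_c - vector_term

# Solve residual = 1 / (K + rank) for rank
implied_keyword_rank = 1.0 / residual - K

print(f"residual keyword term: {residual:.9f}")
print(f"implied keyword rank:  {implied_keyword_rank:.4f}")
```

The implied rank comes out very close to 17, which is beyond top=10 and would fit the idea that more keyword results are retrieved internally than the query asks for.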