ranx
Support `min-max-inverted` norm
Background:
I need to perform a hybrid search using ranx and MyScaleDB. The approach is as follows:
- Use MyScale's `TextSearch` to obtain the BM25 document scores (higher scores indicate higher relevance).
- Use MyScale's `VectorSearch` to obtain the query scores for text vectors (the scores are cosine distances between vectors, so lower scores indicate higher relevance).
When merging the vector search results with the BM25 query results, I need to normalize the scores. However, I only found the `min-max` normalization function, which suits BM25 scores. Therefore, I modified the code and added the `min-max-inverted` function to handle the normalization of vector search scores.
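For illustration, here is a minimal sketch of the idea in plain Python. It is not the code in this PR: the function name `min_max_norm`, the `inverted` flag, the sample scores, and the handling of all-equal scores are assumptions made for the example.

```python
def min_max_norm(scores, inverted=False):
    """Rescale a {doc_id: score} mapping to the range [0, 1].

    inverted=False: the highest raw score maps to 1.0 (e.g. BM25).
    inverted=True:  the lowest raw score maps to 1.0 (e.g. distances).
    """
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    if span == 0:
        # Assumption for this sketch: all-equal scores count as equally relevant.
        return {doc_id: 1.0 for doc_id in scores}
    if inverted:
        return {doc_id: (hi - s) / span for doc_id, s in scores.items()}
    return {doc_id: (s - lo) / span for doc_id, s in scores.items()}


# Hypothetical scores for one query.
bm25_scores = {"d1": 12.0, "d2": 7.0, "d3": 2.0}         # higher is better
cosine_distances = {"d1": 0.25, "d2": 0.5, "d4": 0.75}   # lower is better

bm25_norm = min_max_norm(bm25_scores)
vector_norm = min_max_norm(cosine_distances, inverted=True)

# After normalization both runs are "higher is better", so they can be
# merged, here with a simple unweighted sum per document.
fused = {}
for run in (bm25_norm, vector_norm):
    for doc_id, score in run.items():
        fused[doc_id] = fused.get(doc_id, 0.0) + score

ranking = sorted(fused, key=fused.get, reverse=True)
```

Once both runs point in the same direction, any score-based fusion method can combine them without special-casing the vector results.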
I had a look with @diegoceccarelli and the main comment is that the new norm shares a lot of code with `min-max`. Couldn't you reuse the same code and add a parameter to change how the normalized results are computed? https://github.com/AmenRa/ranx/blob/master/ranx/normalization/min_max_norm.py#L29
@AmenRa what do you think?
@AndreP-git Following your advice, I have refined the code.
@MochiXu thanks for contributing - @AndreP-git and I added some suggestions to your branch - please let us know what you think!
https://github.com/MochiXu/ranx/pull/1/files
@diegoceccarelli Thank you so much for refining my code. Your improvements will make ranx more user-friendly.
Hi everyone, I am deeply sorry for the delay! Thanks @MochiXu for the contribution. Thanks @diegoceccarelli and @AndreP-git for the code review and refinements. Merging.