Support `min-max-inverted` norm
Background Introduction:
I need to perform a hybrid search using ranx and MyScaleDB. The principle is as follows:
- Use MyScale's `TextSearch` to obtain the `BM25` document scores (higher scores indicate higher relevance).
- Use MyScale's `VectorSearch` to obtain the query scores for text vectors (the distance between vectors is calculated using the `Cosine` metric, so lower scores indicate higher relevance).
When merging the vector search results and the BM25 query results, I need to normalize the scores. However, I only found the min-max normalization function suitable for BM25 scores. Therefore, I modified the code and added the min-max-inverted function to handle the normalization of vector search scores.
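To illustrate the idea, here is a minimal sketch of an inverted min-max normalization in plain Python. It is not the code from this PR: the function name `min_max_inverted`, the dict-based input, and the `1e-9` guard are assumptions made for the example.

```python
def min_max_inverted(scores):
    """Map scores to [0, 1] so that the lowest raw score becomes 1.0.

    `scores` is a hypothetical {doc_id: distance} mapping for one query.
    """
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    # Guard against division by zero when all scores are equal (assumed epsilon).
    span = max(hi - lo, 1e-9)
    return {doc_id: (hi - score) / span for doc_id, score in scores.items()}


# Made-up cosine distances: smaller means more relevant.
distances = {"d1": 0.25, "d2": 0.5, "d3": 1.25}
normalized = min_max_inverted(distances)
```

After this step the closest document (`d1`) gets the highest normalized score, so the vector results point in the same direction as min-max normalized BM25 scores and can be fused with them.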
I had a look with @diegoceccarelli and the main comment is that the new normalization shares a lot of code with min-max. Couldn't you reuse the same code and add a parameter that changes how the normalized results are computed? https://github.com/AmenRa/ranx/blob/master/ranx/normalization/min_max_norm.py#L29
@AmenRa what do you think?
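A rough sketch of what such a shared function could look like, assuming a hypothetical `invert` flag and a plain `{doc_id: score}` dict. This is only an illustration of the suggestion, not the actual ranx implementation or the final shape of the PR.

```python
def min_max(scores, invert=False):
    """Min-max normalize scores to [0, 1].

    With invert=False the highest raw score maps to 1.0 (e.g. BM25).
    With invert=True the lowest raw score maps to 1.0 (e.g. distances).
    The `invert` parameter is a hypothetical name chosen for this sketch.
    """
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    # Shared denominator; the epsilon avoids dividing by zero on constant input.
    span = max(hi - lo, 1e-9)
    if invert:
        return {doc_id: (hi - s) / span for doc_id, s in scores.items()}
    return {doc_id: (s - lo) / span for doc_id, s in scores.items()}


# Made-up scores, used both ways to show the two behaviours.
raw = {"a": 2.0, "b": 4.0, "c": 10.0}
plain = min_max(raw)
inverted = min_max(raw, invert=True)
```

Only the numerator differs between the two modes, which is why a single function with one switch covers both cases.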
@AndreP-git Following your advice, I have refined the code.
@MochiXu thanks for contributing - @AndreP-git and I added some suggestions to your branch - please let us know what you think!
https://github.com/MochiXu/ranx/pull/1/files
@diegoceccarelli Thank you so much for refining my code. Your improvements will make ranx more user-friendly.
Hi everyone, I am deeply sorry for the delay! Thanks @MochiXu for the contribution. Thanks @diegoceccarelli and @AndreP-git for the code review and refinements. Merging.