rank_llm
Added Support for Bge-Reranker-v2 into RankLLM
Pull Request Checklist
Reference Issue
ref: N/A
Checklist Items
Before submitting your pull request, please review these items:
- [] Have you followed the contributing guidelines?
- [Y] Have you verified that there are no existing Pull Requests for the same update/change?
- [] Have you updated any relevant documentation or added new tests where needed?
PR Type
- Type: Feature
- Description: Adds support for the bge-reranker-v2 models on Hugging Face for pointwise reranking via rank_llm. The supported bge models are BAAI/bge-reranker-base, BAAI/bge-reranker-large, BAAI/bge-reranker-v2-m3, BAAI/bge-reranker-v2-gemma, and BAAI/bge-reranker-v2-minicpm-layerwise. These models can now be run like any other pointwise reranker in rank_llm.
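For readers unfamiliar with the term, pointwise reranking scores each (query, passage) pair independently and then sorts the candidates by score. The sketch below illustrates only that control flow. The `toy_score` function is a hypothetical token-overlap stand-in, not the bge model, and `pointwise_rerank` is an illustrative name rather than a rank_llm API.

```python
from typing import Callable, List, Tuple


def toy_score(query: str, passage: str) -> float:
    """Hypothetical stand-in scorer: fraction of query tokens found in the passage.

    A real bge reranker would return a learned relevance score instead.
    """
    q_tokens = set(query.lower().split())
    p_tokens = set(passage.lower().split())
    return len(q_tokens & p_tokens) / len(q_tokens) if q_tokens else 0.0


def pointwise_rerank(
    query: str,
    passages: List[str],
    score_fn: Callable[[str, str], float] = toy_score,
) -> List[Tuple[str, float]]:
    """Score every passage on its own, then sort by score, highest first."""
    scored = [(passage, score_fn(query, passage)) for passage in passages]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


candidates = [
    "bm25 is a lexical retrieval method",
    "rerankers score each query passage pair",
    "cats sleep most of the day",
]
ranked = pointwise_rerank("how do rerankers score a passage", candidates)
for passage, score in ranked:
    print(f"{score:.2f}  {passage}")
```

Because each pair is scored in isolation, the scoring step can be batched freely, which is what the `--batch_size` option controls in the command further down.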
Documentation
Dependencies
Aside from rank_llm's general setup, install the following:
pip install -U FlagEmbedding
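A quick way to confirm the dependency is visible to the current interpreter before launching a run is sketched below. It assumes the package's import name is `FlagEmbedding`, matching the pip name. `is_installed` is a hypothetical helper, not part of rank_llm.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("FlagEmbedding"):
    print("FlagEmbedding is available")
else:
    print("FlagEmbedding not found; run: pip install -U FlagEmbedding")
```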
Running bge
Run a bge reranker from the rank_llm directory with a single command:
# To hide progress bars, set the environment variable TQDM_DISABLE=1
python src/rank_llm/scripts/run_rank_llm.py --model_path=insert_model_name_on_hf --dataset=insert_dataset_path_or_name --retrieval_method=insert_retrieval_method --prompt_mode=bge-reranker-v2 --batch_size=insert_batch_size --context_size=insert_context_size
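If the script is driven from Python rather than typed by hand, the same invocation can be assembled as an argument list. This is a minimal sketch: `build_command` and `run` are hypothetical helpers, and the only flags used are the ones shown in the command above. It assumes `--batch_size` and `--context_size` may be omitted, as they are in the test commands of this PR.

```python
import os
import subprocess
import sys
from typing import List, Optional

SCRIPT = "src/rank_llm/scripts/run_rank_llm.py"


def build_command(
    model_path: str,
    dataset: str,
    retrieval_method: str,
    batch_size: Optional[int] = None,
    context_size: Optional[int] = None,
) -> List[str]:
    """Assemble the run_rank_llm.py argument list for a bge reranker."""
    cmd = [
        sys.executable,
        SCRIPT,
        f"--model_path={model_path}",
        f"--dataset={dataset}",
        f"--retrieval_method={retrieval_method}",
        "--prompt_mode=bge-reranker-v2",
    ]
    # Optional flags are appended only when a value is supplied.
    if batch_size is not None:
        cmd.append(f"--batch_size={batch_size}")
    if context_size is not None:
        cmd.append(f"--context_size={context_size}")
    return cmd


def run(cmd: List[str], quiet: bool = False) -> subprocess.CompletedProcess:
    """Run the command from the rank_llm directory; quiet=True sets TQDM_DISABLE=1."""
    env = dict(os.environ)
    if quiet:
        env["TQDM_DISABLE"] = "1"
    return subprocess.run(cmd, env=env, check=True)
```

For example, `run(build_command("BAAI/bge-reranker-v2-m3", "dl19", "bm25"), quiet=True)` would launch the m3 model with progress bars hidden.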
Tests
The following smoke tests run each supported model on the TREC Deep Learning 2019 dataset (dl19) with BM25 retrieval, to confirm that the integration works end to end.
# base
python src/rank_llm/scripts/run_rank_llm.py --model_path=BAAI/bge-reranker-base --dataset=dl19 --retrieval_method=bm25 --prompt_mode=bge-reranker-v2
# large
python src/rank_llm/scripts/run_rank_llm.py --model_path=BAAI/bge-reranker-large --dataset=dl19 --retrieval_method=bm25 --prompt_mode=bge-reranker-v2
# m3
python src/rank_llm/scripts/run_rank_llm.py --model_path=BAAI/bge-reranker-v2-m3 --dataset=dl19 --retrieval_method=bm25 --prompt_mode=bge-reranker-v2
# gemma
python src/rank_llm/scripts/run_rank_llm.py --model_path=BAAI/bge-reranker-v2-gemma --dataset=dl19 --retrieval_method=bm25 --prompt_mode=bge-reranker-v2
# minicpm-layerwise
python src/rank_llm/scripts/run_rank_llm.py --model_path=BAAI/bge-reranker-v2-minicpm-layerwise --dataset=dl19 --retrieval_method=bm25 --prompt_mode=bge-reranker-v2
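The five commands above differ only in `--model_path`, so they can be generated from a list. The sketch below prints the commands rather than executing them. `MODELS` and `dl19_command` are illustrative names, not part of rank_llm.

```python
import shlex

# The five checkpoints listed in the PR description.
MODELS = [
    "BAAI/bge-reranker-base",
    "BAAI/bge-reranker-large",
    "BAAI/bge-reranker-v2-m3",
    "BAAI/bge-reranker-v2-gemma",
    "BAAI/bge-reranker-v2-minicpm-layerwise",
]


def dl19_command(model_path: str) -> str:
    """Return the dl19/bm25 smoke-test command for one model as a shell string."""
    return shlex.join(
        [
            "python",
            "src/rank_llm/scripts/run_rank_llm.py",
            f"--model_path={model_path}",
            "--dataset=dl19",
            "--retrieval_method=bm25",
            "--prompt_mode=bge-reranker-v2",
        ]
    )


for model in MODELS:
    print(dl19_command(model))
```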