infinity
Reranker API “top_k” Support
Feature request
Rerank top_n is widely supported, for example here: https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/postprocessor/llama-index-postprocessor-jinaai-rerank/llama_index/postprocessor/jinaai_rerank/base.py
Suggest adding this feature. Thanks.
Motivation
This is currently a key feature for reranking.
Your contribution
I could submit a PR; please let me know which files need to be updated.
I think you are looking for reranker models? And yes, these are supported under the Rerank endpoint. See the README.md.
A top_k argument is not widely supported, and there is no OpenAI rerank spec.
https://docs.cohere.com/reference/rerank is supported - why would you need it? I personally see that you can do this in two lines of code client side.
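The "two lines of code client side" could look like the sketch below. It assumes the rerank endpoint returns a list of objects with `index` and `relevance_score` fields (a common shape across rerank APIs, but an assumption here):

```python
# Assumed response shape: one {"index", "relevance_score"} object per input document.
results = [
    {"index": 0, "relevance_score": 0.12},
    {"index": 1, "relevance_score": 0.97},
    {"index": 2, "relevance_score": 0.55},
]

# The "two lines": sort by score descending, then keep the best top_n.
top_n = 2
top = sorted(results, key=lambda r: r["relevance_score"], reverse=True)[:top_n]

print([r["index"] for r in top])  # -> [1, 2]
```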
Maybe when a third-party app talks to the API directly and does not support selecting the top n itself, so we have no direct control over how the reranker output is used.
@etwk Your instructions are not clear.
Please deploy, e.g. https://huggingface.co/mixedbread-ai/mxbai-rerank-xsmall-v1 via
infinity_emb v2 --model-id mixedbread-ai/mxbai-rerank-xsmall-v1
@aloababa
Can you align this implementation to one of the following API protocols?
- https://docs.voyageai.com/reference/reranker-api `{return_documents: bool = False}` -> `{.., documents: str}`
- https://jina.ai/reranker/ `{return_documents: bool = False}` -> `{.., documents: str}`
- https://docs.cohere.com/reference/rerank `{no_kwarg_for_documents}` -> `{document: {text: str}}`
- https://huggingface.github.io/text-embeddings-inference `{return_text: bool = False}` -> `{.., text: str}`
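As one possible shape for the PR, here is a minimal sketch of a Voyage/Jina-style contract. The field names (`top_k`, `return_documents`, `relevance_score`) follow those docs, but the exact models and the helper function are assumptions, not infinity's actual implementation:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RerankRequest:
    # Voyage/Jina-style request; top_k is the proposed cap on returned results.
    query: str
    documents: list[str]
    top_k: Optional[int] = None
    return_documents: bool = False

@dataclass
class RerankResult:
    index: int
    relevance_score: float
    document: Optional[str] = None  # only populated when return_documents=True

def rerank_response(req: RerankRequest, scores: list[float]) -> list[RerankResult]:
    """Sort document indices by score, apply top_k, optionally echo documents."""
    order = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    if req.top_k is not None:
        order = order[: req.top_k]
    return [
        RerankResult(
            index=i,
            relevance_score=scores[i],
            document=req.documents[i] if req.return_documents else None,
        )
        for i in order
    ]
```

With this shape, `top_k` is applied server-side after scoring, so clients that cannot post-process the response still get only the n best documents.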