
BedrockRerank using newly available Cohere embedding model

Open austinmw opened this issue 1 year ago • 9 comments

Feature request

Hi, the new Cohere embedding models are now available on Amazon Bedrock. How can we use them for their reranking capability (instead of just embedding via the BedrockEmbeddings class)?

Motivation

These models perform well for reranking.

austinmw avatar Nov 18 '23 17:11 austinmw

🤖

Hello @austinmw! Great to see you again in the LangChain community. I'm diving into your issue about using the new Cohere embedding models on Amazon Bedrock for reranking. Expect a detailed response soon.

dosubot[bot] avatar Nov 18 '23 17:11 dosubot[bot]

🤖

Hi @austinmw, great to see you back on the LangChain repository! I appreciate your continuous interest and contributions.

To utilize the reranking capability of the new Cohere embedding models available on Amazon Bedrock in the LangChain framework, you would need to modify the _embedding_func method in the BedrockEmbeddings class.

Currently, this method is only set up to handle embedding tasks. To add reranking functionality, you would need to modify the input and output handling to match the requirements of the reranking task.

Here's an example of how you might modify the _embedding_func method to handle reranking:

def _reranking_func(self, query: str, documents: List[str]) -> List[str]:
    """Call out to a Bedrock reranking endpoint (hypothetical)."""
    # Requires: import json, os; from typing import List
    # Replace newlines, which can negatively affect performance.
    query = query.replace(os.linesep, " ")
    documents = [doc.replace(os.linesep, " ") for doc in documents]

    # Format the request body for the provider.
    provider = self.model_id.split(".")[0]
    _model_kwargs = self.model_kwargs or {}
    input_body = {**_model_kwargs}
    if provider == "cohere":
        # "rerank" is a hypothetical input_type for this sketch.
        input_body.setdefault("input_type", "rerank")
        input_body["query"] = query
        input_body["documents"] = documents
    else:
        # Other providers (e.g. "amazon") use "inputText" for the query.
        input_body["inputText"] = query
        input_body["documents"] = documents
    body = json.dumps(input_body)

    try:
        # Invoke the Bedrock API.
        response = self.client.invoke_model(
            body=body,
            modelId=self.model_id,
            accept="application/json",
            contentType="application/json",
        )
        # Parse the response; the "reranked_documents" field is hypothetical.
        response_body = json.loads(response.get("body").read())
        return response_body.get("reranked_documents")
    except Exception as e:
        raise ValueError(f"Error raised by inference endpoint: {e}") from e

This function takes a query and a list of documents as input, and returns a list of reranked documents. The reranking is done by the Cohere model on the Bedrock platform.

Please note that this is a hypothetical example and the actual implementation may vary based on the specific requirements of the reranking task and the specific API of the Cohere model on the Bedrock platform.
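If such a method existed on BedrockEmbeddings, usage might look like the following (a hypothetical sketch: _reranking_func is not a real LangChain or Bedrock API, and the import path may differ by version):

from langchain_community.embeddings import BedrockEmbeddings

embeddings = BedrockEmbeddings(model_id="cohere.embed-english-v3")

# Hypothetical call; the method and its return shape come from the sketch above.
reranked = embeddings._reranking_func(
    query="What is Amazon Bedrock?",
    documents=["Bedrock is a managed AWS service.", "Cohere builds language models."],
)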

Additionally, the LangChain framework does support reranking functionality. This is evident in the MapRerankDocumentsChain class in the map_rerank.py file. The MapRerankDocumentsChain class combines documents by mapping a chain over them and then reranking the results. This is done by calling an LLMChain on each input document. The LLMChain is expected to have an OutputParser that parses the result into both an answer (answer_key) and a score (rank_key). The answer with the highest score is then returned.

The reranking functionality is implemented in the _process_results method, where the results are sorted based on the score (rank_key), and the answer with the highest score is returned.

Here is the relevant code snippet:

def _process_results(
    self,
    docs: List[Document],
    results: Sequence[Union[str, List[str], Dict[str, str]]],
) -> Tuple[str, dict]:
    typed_results = cast(List[dict], results)
    sorted_res = sorted(
        zip(typed_results, docs), key=lambda x: -int(x[0][self.rank_key])
    )
    output, document = sorted_res[0]
    extra_info = {}
    if self.metadata_keys is not None:
        for key in self.metadata_keys:
            extra_info[key] = document.metadata[key]
    if self.return_intermediate_steps:
        extra_info["intermediate_steps"] = results
    return output[self.answer_key], extra_info

I hope this helps! Let me know if you have any other questions.


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

dosubot[bot] avatar Nov 18 '23 17:11 dosubot[bot]

@austinmw Are you sure the reranking models are also available on Bedrock?

Cohere's webpage states that Rerank isn't available there yet (https://cohere.com/deployment-options/aws):

[Screenshot of Cohere's AWS deployment page indicating Rerank is not yet available on Bedrock]

dentroai avatar Feb 02 '24 20:02 dentroai

Hi @dentro-innovation, the "Cohere Rerank" model is not available on Bedrock; however, the latest Cohere embedding models (cohere-embed-english-v3.0 and cohere-embed-multilingual-v3.0) both appear on the Hugging Face MTEB Reranking leaderboard (ranked 22nd and 25th, respectively), so my assumption was that they could be configured for reranking use cases.

austinmw avatar Feb 02 '24 20:02 austinmw

I wasn't aware that you can do reranking with an embedding model; I thought the two were different. But you're right, there are embedding models listed on the MTEB reranking leaderboard.

Do you have any resource that explains the relationship between embedding and reranking models? Or can you explain it in your own words, @austinmw? I'd greatly appreciate it, as neither a web search nor asking an LLM really resolved my confusion.

dentroai avatar Feb 02 '24 20:02 dentroai

I have some questions myself to be honest. My assumption is that you would:

  1. embed the query
  2. embed each retrieved document
  3. calculate the cosine similarity between the embedded query and each embedded document
  4. sort the retrieved documents according to these similarity scores

But it's possible I could be missing some intricacy of this model and how it should be used for reranking purposes. I posted a question on the Hugging Face model page.
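For illustration, here's a minimal sketch of the four steps above using Bedrock Cohere embeddings and cosine similarity (the import path, model id, and numpy-based scoring are my assumptions, not an established LangChain reranking API):

import numpy as np
from langchain_community.embeddings import BedrockEmbeddings

embeddings = BedrockEmbeddings(model_id="cohere.embed-english-v3")

def cosine_rerank(query: str, docs: list[str]) -> list[str]:
    # Steps 1 and 2: embed the query and each retrieved document.
    q = np.array(embeddings.embed_query(query))
    d = np.array(embeddings.embed_documents(docs))
    # Step 3: cosine similarity between the query and every document.
    sims = d @ q / (np.linalg.norm(d, axis=1) * np.linalg.norm(q))
    # Step 4: sort the documents by similarity score, highest first.
    return [doc for _, doc in sorted(zip(sims, docs), reverse=True)]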

austinmw avatar Feb 02 '24 21:02 austinmw

I have some questions myself to be honest. My assumption is that you would:

  1. embed the query
  2. embed each retrieved document
  3. calculate the cosine similarity between the embedded query and each embedded document
  4. sort the retrieved documents according to these similarity scores

But it's possible I could be missing some intricacy of this model and how it should be used for reranking purposes. I posted a question on the Hugging Face model page.

Did you get an answer to this question? It's still not super clear to me after reading the response you got on the HF model page.

We are trying to determine whether the Cohere reranker simply generates embeddings of the query and documents and calculates a cosine similarity, or whether it does something else. If it's the former, we're looking to build an app that does the same process and compares the results across different embedding models.

drobbins-ancile avatar Apr 10 '24 20:04 drobbins-ancile

I have some questions myself to be honest. My assumption is that you would:

  1. embed the query
  2. embed each retrieved document
  3. calculate the cosine similarity between the embedded query and each embedded document
  4. sort the retrieved documents according to these similarity scores

My understanding is that there are two approaches (see cross-encoders-as-reranker):

  1. Bi-Encoding: This is what is described above, and it is what happens when you use embeddings and a vector database to look up documents by cosine similarity.
  2. CrossEncoder: Here the question is paired with each retrieved result and passed into a sentence transformer, which calculates a similarity score. This is different from the above, and I think it is what the Cohere reranker implements.

My guess is we need to add the Bedrock Cohere reranker to the LangChain CrossEncoderReranker.

The closest I could find to this is the [HuggingFaceCrossEncoder](https://python.langchain.com/docs/integrations/document_transformers/cross_encoder_reranker/), so we need a BedRockCohereCrossEncoder().
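For contrast with the bi-encoder steps described earlier in the thread, here is a minimal cross-encoder reranking sketch using the sentence-transformers CrossEncoder class (the model name is an example open-source cross-encoder, not the Cohere or Bedrock model):

from sentence_transformers import CrossEncoder

# A small open-source cross-encoder trained for passage reranking.
model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "What is Amazon Bedrock?"
docs = [
    "Amazon Bedrock is a managed service for foundation models.",
    "Cohere provides embedding and reranking models.",
]

# Unlike a bi-encoder, each (query, document) pair is encoded and scored jointly.
scores = model.predict([(query, doc) for doc in docs])
reranked = [doc for _, doc in sorted(zip(scores, docs), reverse=True)]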

Kevin-McIsaac avatar Apr 22 '24 06:04 Kevin-McIsaac

@austinmw My understanding of cross-encoder reranking (with an LLM) is as follows. You do not use the embedding model but the language model.

You send the question and each document (already retrieved through the embedding model) to the language model and ask it to answer "YES" or "NO". You get the logprob of the answer, convert it to a probability, and then rerank based on a threshold probability. If you use the embedding model for reranking, that is no different from a plain semantic match; you can just as well do a semantic match/cosine similarity with the embedding model directly, so why would we need to wrap that as a reranker?
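For illustration, a rough sketch of that YES/NO logprob approach using the OpenAI chat API (the model name and prompt are placeholders, and a Bedrock-hosted LLM would need its own equivalent; treat this as an assumption-laden sketch, not a reference implementation):

import math
from openai import OpenAI

client = OpenAI()

def relevance_prob(query: str, doc: str) -> float:
    # Ask the LLM a YES/NO relevance question and request token logprobs.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"Query: {query}\nDocument: {doc}\n"
                       "Is the document relevant to the query? Answer YES or NO.",
        }],
        max_tokens=1,
        logprobs=True,
    )
    token = resp.choices[0].logprobs.content[0]
    p = math.exp(token.logprob)  # convert the logprob to a probability
    # If the model answered NO, approximate P(YES) as the complement.
    return p if token.token.strip().upper().startswith("YES") else 1.0 - p

def llm_rerank(query: str, docs: list[str], threshold: float = 0.5) -> list[str]:
    # Score every document, keep those above the threshold, highest first.
    scored = sorted(((relevance_prob(query, d), d) for d in docs), reverse=True)
    return [d for p, d in scored if p >= threshold]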

rajib76 avatar May 15 '24 18:05 rajib76

First, we need Amazon to add the SageMaker Cohere Rerank model to Bedrock (and support on-demand per-token/??? pricing!). I for one really hope they do this soon.

peebles avatar May 25 '24 20:05 peebles

First, we need Amazon to add the SageMaker Cohere Rerank model to Bedrock (and support on-demand per-token/??? pricing!). I for one really hope they do this soon.

Very good point! The current way of using the Cohere rerank model is not reasonable: deploying it via SageMaker incurs instance pricing, which is much more expensive than per-token pricing if you call the Cohere API directly.

applepieiris avatar Aug 15 '24 02:08 applepieiris

Hi, @austinmw. I'm Dosu, and I'm helping the LangChain team manage their backlog. I'm marking this issue as stale.

Issue Summary:

  • You requested guidance on using Cohere embedding models on Amazon Bedrock for reranking tasks.
  • I provided a response on modifying the BedrockEmbeddings class for reranking support.
  • You clarified that while the "Cohere Rerank" model isn't available, embedding models could be used for reranking.
  • The discussion included technical insights on embedding vs. reranking models from several contributors.

Next Steps:

  • Please confirm if this issue is still relevant to the latest version of the LangChain repository by commenting here.
  • If there is no further activity, the issue will be automatically closed in 7 days.

Thank you for your understanding and contribution!

dosubot[bot] avatar Nov 14 '24 16:11 dosubot[bot]