TransformersSimilarityRanker (transformers_similarity.py) runtime error
**Describe the bug**
The ranker fails regardless of its input.

**Error message**
The following code triggers the error, although any invocation of the ranker will do:
```python
from haystack.components.rankers import TransformersSimilarityRanker

ranker = TransformersSimilarityRanker(model="sentence-transformers/all-MiniLM-L6-v2")
ranker.warm_up()

# retriever_output["documents"] contains a list of Document objects
# question is a string holding the query
ranker.run(query=question, documents=retriever_output["documents"])
```
When executed:

```
Traceback (most recent call last):
  File "xxxxxx", line 130, in <module>
    ranked_output = ranker.run(query=question, documents=retriever_output["documents"])
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "xxxx/lib/python3.12/site-packages/haystack/components/rankers/transformers_similarity.py", line 268, in run
    documents[i].score = similarity_scores[i]
    ~~~~~~~~~~~~~~~~~^^^
TypeError: list indices must be integers or slices, not list
```
On inspection, `i` at this point is itself a list: `sorted_indices`, from which `i` is drawn, is a list of lists rather than a flat list of integers.
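The failure can be reproduced in miniature without any models. The sketch below uses plain Python with a toy `argsort_desc` function (my own stand-in for `torch.argsort`, not Haystack code) to show how a 2D score array yields a list of lists of indices, and why indexing a document list with one of those inner lists raises exactly this `TypeError`:

```python
def argsort_desc(scores):
    # Toy stand-in for torch.argsort(..., descending=True): a 1D input
    # yields flat integer indices; a 2D input yields one index list per row.
    if scores and isinstance(scores[0], list):
        return [argsort_desc(row) for row in scores]
    return sorted(range(len(scores)), key=scores.__getitem__, reverse=True)

docs = ["doc_a", "doc_b", "doc_c"]

# Cross-encoder-style output: one scalar score per document (1D),
# so each i is an int and indexing works.
for i in argsort_desc([0.92, 0.15, 0.47]):
    docs[i]

# Embedding-model-style output is 2D, so each "index" is itself a
# list and indexing fails as in the traceback above.
for i in argsort_desc([[0.1, 0.2, 0.7]]):
    try:
        docs[i]
    except TypeError as err:
        print(err)  # list indices must be integers or slices, not list
```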
**Expected behavior**
`i` should be a scalar index, so that each document is assigned a single score.
**To Reproduce**
Run the code above with the following Pipfile:
```toml
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
haystack-ai = "*"
sentence-transformers = ">=2.2.0"
pypdf = "*"
mdit-plain = "*"
llama-cpp-python = "==0.2.56"
llama-cpp-haystack = "*"
accelerate = "*"

[dev-packages]

[requires]
python_version = "3.12"
```
**FAQ Check**
- [x] Have you had a look at our new FAQ page?
**System:**
- OS: Mac OS X Ventura 13.6.1
- GPU/CPU: CPU
- Haystack version (commit or version number): current
- DocumentStore: InMemoryDocumentStore (but does not matter)
- Reader: none
- Retriever: InMemoryEmbeddingRetriever (but does not matter)
Hi @kristapsdz-saic, I encountered the same issue, which seems to be caused by the model's output structure: the model produces a 2D array. I tried a reranker model instead (see below), and it worked for me without any errors. However, further debugging is required to resolve the issue completely.
```python
ranker = TransformersSimilarityRanker(model="BAAI/bge-reranker-large")
```
Hey @kristapsdz-saic, @nvenkat94 is correct: this component only supports models with a cross-encoder architecture (SequenceClassification in Hugging Face terms). Models with "reranker" or "cross-encoder" in their name typically use this architecture and will be supported by this component.

The model in your original example, "sentence-transformers/all-MiniLM-L6-v2", is an embedding model (a bi-encoder), which is not supported by this component. Sentence Transformers has a nice explanation of the difference between bi-encoders and cross-encoders here.
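To make the architectural distinction concrete, here is a toy sketch (illustrative functions only, not the real model code): a bi-encoder embeds the query and each document independently and compares the vectors afterwards, while a cross-encoder reads each (query, document) pair jointly and emits one scalar per pair, which is the flat 1D score list the ranker expects:

```python
# Bi-encoder style: encode query and documents separately, then compute
# a similarity (here a toy dot product) from the vectors afterwards.
def bi_encoder_scores(query_vec, doc_vecs):
    return [sum(q * d for q, d in zip(query_vec, vec)) for vec in doc_vecs]

# Cross-encoder style: score each (query, document) pair jointly with a
# single scoring function, yielding one scalar per document.
def cross_encoder_scores(query, docs, score_pair):
    return [score_pair(query, doc) for doc in docs]

print(bi_encoder_scores([1.0, 0.0], [[0.9, 0.1], [0.2, 0.8]]))
# one flat score per document: [0.9, 0.2]

# Toy pair scorer: count words shared between query and document.
overlap = lambda q, d: float(len(set(q.split()) & set(d.split())))
print(cross_encoder_scores("red fox", ["red dog", "blue car"], overlap))
# one flat score per pair: [1.0, 0.0]
```

A real cross-encoder replaces `overlap` with a full forward pass over the concatenated pair, which is why it cannot be swapped for an embedding model without changing the output shape.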