HybridRetriever raises KeyError: -1 if the number of documents is less than 1_000

Open tshu-w opened this issue 8 months ago • 1 comments

The cutoff of msearch for HybridRetriever is hardcoded to 1_000, which makes map_internal_ids_to_original_ids raise a KeyError when the collection contains fewer than 1_000 documents. The failing key is -1, which presumably fills the result slots that have no matching document: https://github.com/AmenRa/retriv/blob/c9baa011e3071c2369f81f5b6f3a87f0d444072d/retriv/hybrid_retriever.py#L254-L255

Thus, map_internal_ids_to_original_ids should skip the -1 entries:

def map_internal_ids_to_original_ids(self, doc_ids: Iterable) -> List[str]:
    return [self.id_mapping[doc_id] for doc_id in doc_ids if doc_id != -1]
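
To illustrate the failure and the proposed filter outside the library, here is a minimal standalone sketch. The `id_mapping` dictionary, the `padded_ids` list, and the two helper functions are hypothetical stand-ins, not retriv code; the sketch assumes that a search with a cutoff larger than the collection returns -1 in the unfilled slots.

```python
from typing import Dict, Iterable, List

# Hypothetical stand-in for the retriever's internal-to-original ID mapping.
id_mapping: Dict[int, str] = {0: "doc_a", 1: "doc_b", 2: "doc_c"}

# Assumed result shape: 3 documents, cutoff of 5, so two slots hold -1.
padded_ids: List[int] = [2, 0, 1, -1, -1]


def map_unfiltered(doc_ids: Iterable[int]) -> List[str]:
    # Looks up every ID, including the -1 placeholders.
    return [id_mapping[doc_id] for doc_id in doc_ids]


def map_filtered(doc_ids: Iterable[int]) -> List[str]:
    # Same lookup, but skips the -1 placeholders, as proposed in this issue.
    return [id_mapping[doc_id] for doc_id in doc_ids if doc_id != -1]


try:
    map_unfiltered(padded_ids)
except KeyError as err:
    print(f"KeyError: {err}")

print(map_filtered(padded_ids))
```

The filter only drops placeholders, so the order of the real results is preserved.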

tshu-w, Oct 17 '23 11:10