
TopKretriever similarity calculation

Open Jerry-723 opened this issue 8 months ago • 0 comments

https://github.com/Shark-NLP/OpenICL/blob/1613ae10b88ba2dbfed425c4ee078b2a6586152e/openicl/icl_retriever/icl_topk_retriever.py#L113

When faiss is used to select candidate examples, the nearest neighbors are ranked by vector inner product. Why not use cosine similarity? The embeddings are not normalized, so the inner-product scores are affected by the vector norms. Should the encoding call be changed to:

res = self.model.encode(raw_text, show_progress_bar=False, normalize_embeddings=True)
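To illustrate the concern, here is a minimal sketch with toy 2-D vectors in plain Python. It does not use faiss or the actual encoder; the helper names (`dot`, `l2_normalize`, `rank_by_inner_product`) and the vectors are hypothetical and chosen only to show that inner-product ranking can favor a large-norm vector, while ranking on L2-normalized vectors matches cosine similarity.

```python
import math

def dot(a, b):
    # Plain inner product of two equal-length vectors.
    return sum(x * y for x, y in zip(a, b))

def l2_normalize(v):
    # Scale a vector to unit length.
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def cosine(a, b):
    # Cosine similarity, computed as the inner product of unit vectors.
    return dot(l2_normalize(a), l2_normalize(b))

def rank_by_inner_product(query, candidates):
    # Return candidate indices sorted by descending inner product.
    scores = [dot(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)

# Toy data (assumed for illustration only).
query = [1.0, 0.0]
candidates = [
    [0.9, 0.1],  # almost the same direction as the query, small norm
    [3.0, 4.0],  # different direction, large norm
]

# Ranking on raw vectors: the large-norm candidate wins.
raw_ranking = rank_by_inner_product(query, candidates)

# Ranking on unit vectors: equivalent to ranking by cosine similarity.
normalized_ranking = rank_by_inner_product(
    l2_normalize(query), [l2_normalize(c) for c in candidates]
)
cosine_ranking = sorted(
    range(len(candidates)),
    key=lambda i: cosine(query, candidates[i]),
    reverse=True,
)

print(raw_ranking, normalized_ranking, cosine_ranking)
```

In this toy case the raw inner product ranks the second candidate first purely because of its norm. If I understand correctly, passing `normalize_embeddings=True` to `encode` would have the same effect as `l2_normalize` here, provided both the indexed embeddings and the query embeddings go through that call.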

Jerry-723 · May 02 '25 08:05