OpenICL
TopkRetriever similarity calculation
https://github.com/Shark-NLP/OpenICL/blob/1613ae10b88ba2dbfed425c4ee078b2a6586152e/openicl/icl_retriever/icl_topk_retriever.py#L113
When faiss is used to select candidate examples, candidates are ranked by inner product, and the one with the largest inner product is treated as the closest. Why not use cosine similarity? The embeddings are not normalized, so the inner-product scores are affected by the vector norms. Should the encoding call be changed to:
res = self.model.encode(raw_text, show_progress_bar=False, normalize_embeddings=True)
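
To illustrate the concern, here is a toy sketch in plain Python. It is not the OpenICL or faiss code: the helper names (`dot`, `l2_normalize`, `top1`) and the 2-d vectors are made up for illustration. It only shows that ranking by raw inner product can prefer a longer vector pointing in a different direction, while ranking after L2 normalization (which makes the inner product equal to cosine similarity) prefers the vector with the same direction.

```python
import math


def dot(a, b):
    # Plain inner product of two equal-length vectors.
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v):
    # Scale v to unit length so that inner product equals cosine similarity.
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def top1(query, candidates, normalize=False):
    # Hypothetical helper: index of the candidate with the largest inner product.
    if normalize:
        query = l2_normalize(query)
        candidates = [l2_normalize(c) for c in candidates]
    scores = [dot(query, c) for c in candidates]
    return max(range(len(scores)), key=scores.__getitem__)


query = [1.0, 0.0]
candidates = [
    [1.0, 0.0],  # same direction as the query, norm 1
    [3.0, 4.0],  # different direction, norm 5
]

raw_best = top1(query, candidates)                  # raw inner product
cos_best = top1(query, candidates, normalize=True)  # cosine similarity

print("raw inner product picks candidate", raw_best)
print("normalized (cosine) picks candidate", cos_best)
```

With unnormalized vectors the second candidate wins purely because of its larger norm (score 3.0 versus 1.0); after normalization the first candidate, which is identical in direction to the query, ranks first.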