Why does cosine similarity come out negative? Calling uer/sbert-base-chinese-nli with sentence_transformers' util.cos_sim()

Open peter65374 opened this issue 2 years ago • 0 comments

While running a unit test, an example sentence pair produced a negative cosine similarity. What is going on here? Is this a problem with the util.cos_sim function?

        try:
            logger.info("START - loading Sen-SIMILARITY model")
            # model = SentenceTransformer('distiluse-base-multilingual-cased-v2')
            model = SentenceTransformer('uer/sbert-base-chinese-nli')  # the uer model performs much better on Chinese.
            logger.info("FINISH - loading Sen-SIMILARITY model")
        except Exception as e:
            # Pass the exception through a %s placeholder; a bare extra argument breaks log formatting.
            logger.warning("Exception thrown during initialising pretrained model: %s", e)
        try:
            # Compute embeddings for both lists
            embedding1 = model.encode(sentence1)
            embedding2 = model.encode(sentence2)

            # Compute cosine similarities
            simcos = util.cos_sim(embedding1, embedding2)

            return simcos
        except Exception as e:
            # Pass the exception through a %s placeholder; a bare extra argument breaks log formatting.
            logger.warning("Exception thrown during get similarity: %s", e)
            return None
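
For reference, cosine similarity is the dot product of two vectors divided by the product of their lengths, so its range is [-1, 1], and a negative score is within the normal range of the measure rather than necessarily a bug in util.cos_sim. The following is a minimal sketch in plain NumPy, not the sentence_transformers implementation; the helper name `cosine_similarity` and the toy vectors are made up for illustration and are not real embeddings from uer/sbert-base-chinese-nli.

```python
import numpy as np


def cosine_similarity(a, b):
    """Hypothetical helper: dot product of a and b divided by the product of their norms."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy 2-D vectors (illustrative only, not model output).
same = cosine_similarity([1.0, 2.0], [2.0, 4.0])        # same direction: close to 1
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])  # perpendicular: close to 0
opposite = cosine_similarity([1.0, 2.0], [-1.0, -2.0])  # opposite direction: close to -1

print(same, orthogonal, opposite)
```

Whether a given sentence pair should score below zero with this particular model is a separate question that this sketch does not answer. Note also that util.cos_sim returns a matrix of pairwise scores as a tensor, so for two single sentences the score sits at `simcos[0][0]`.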

peter65374, Oct 27 '22 07:10