UER-py
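Why does cosine similarity come out negative? Calling uer/sbert-base-chinese-nli with util.cos_sim() from sentence_transformers
While running a unit test, an example sentence pair produced a negative cosine similarity. What is going on here? Is this a problem with the util.cos_sim function?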
import logging

from sentence_transformers import SentenceTransformer, util

logger = logging.getLogger(__name__)

try:
    logger.info("START - loading Sen-SIMILARITY model")
    # model = SentenceTransformer('distiluse-base-multilingual-cased-v2')
    model = SentenceTransformer('uer/sbert-base-chinese-nli')  # The uer model performs much better on Chinese.
    logger.info("FINISH - loading Sen-SIMILARITY model")
except Exception as e:
    logger.warning("Exception thrown during initialising pretrained model: %s", e)


# The enclosing function was omitted from the original post; this name is a placeholder.
def get_similarity(sentence1, sentence2):
    try:
        # Compute embeddings for both inputs
        embedding1 = model.encode(sentence1)
        embedding2 = model.encode(sentence2)
        # Compute cosine similarities
        simcos = util.cos_sim(embedding1, embedding2)
        return simcos
    except Exception as e:
        logger.warning("Exception thrown during get similarity: %s", e)
        return None
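
For reference, cosine similarity is the dot product of two vectors divided by the product of their norms, so by definition it ranges over [-1, 1] rather than [0, 1]. Whenever embeddings can contain negative components, a negative score is mathematically possible and does not by itself point to a bug in util.cos_sim. The sketch below illustrates this with plain Python lists; `cosine_similarity` is a hypothetical helper written for this example, not part of sentence_transformers, and the vectors are made up rather than real model embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Illustrative toy vectors (assumptions, not model output).
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])   # close to 1
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])       # 0
opposite = cosine_similarity([1.0, 2.0], [-1.0, -2.0])       # close to -1

print(same_direction, orthogonal, opposite)
```

Vectors pointing in roughly opposite directions give a negative value, so an unrelated sentence pair scoring slightly below zero is consistent with the definition.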