learning-nlp

Chapter 5, hands-on text keyword extraction: is the cosine similarity written incorrectly?

Open GZDXGeorge opened this issue 2 years ago • 0 comments

1. The original book:

   # Cosine similarity calculation
        def calsim(l1, l2):
            a, b, c = 0.0, 0.0, 0.0
            for t1, t2 in zip(l1, l2):
                x1 = t1[1]
                x2 = t2[1]
                a += x1 * x1  # Shouldn't this be x1 * x2?
                b += x1 * x1
                c += x2 * x2
            sim = a / math.sqrt(b * c) if not (b * c) == 0.0 else 0.0
            return sim
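
As quoted, `a` and `b` accumulate the same quantity, so the function returns `b / sqrt(b * c)`, which is `sqrt(b / c)`: the ratio of the two vector norms, not a cosine, and it can exceed 1. If the numerator is meant to be the dot product, a corrected version might look like the sketch below. It assumes, as the quoted code appears to, that `l1` and `l2` are aligned, equal-length lists of `(id, weight)` pairs; the variable names `dot`, `norm1_sq`, and `norm2_sq` are illustrative and not from the book.

```python
import math


def calsim(l1, l2):
    """Cosine similarity of two aligned lists of (id, weight) pairs (assumed layout)."""
    dot, norm1_sq, norm2_sq = 0.0, 0.0, 0.0
    for t1, t2 in zip(l1, l2):
        x1 = t1[1]  # weight of the current term in the first vector
        x2 = t2[1]  # weight of the same term in the second vector
        dot += x1 * x2       # numerator: dot product
        norm1_sq += x1 * x1  # squared norm of the first vector
        norm2_sq += x2 * x2  # squared norm of the second vector
    denom = math.sqrt(norm1_sq * norm2_sq)
    # Return 0.0 when either vector is all zeros, matching the original guard.
    return dot / denom if denom != 0.0 else 0.0
```

With this change, two vectors pointing in the same direction score 1.0 and two orthogonal vectors score 0.0, which the original numerator does not guarantee.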

2. The formula I found online:

[image: cosine similarity formula]
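
For reference, the standard definition of cosine similarity for two vectors $A$ and $B$ of length $n$ is given below; this is presumably the formula the image shows, though that is an assumption. The symbols $A_i$ and $B_i$ are notation chosen here and correspond to `x1` and `x2` in `calsim`.

```latex
\cos\theta
  = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
  = \frac{\sum_{i=1}^{n} A_i B_i}
         {\sqrt{\sum_{i=1}^{n} A_i^{2}} \; \sqrt{\sum_{i=1}^{n} B_i^{2}}}
```

Read against the code, `b` and `c` are the two sums under the square roots, and `a` is the numerator, so under this definition `a` would accumulate `x1 * x2`.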

GZDXGeorge, Apr 15 '22 07:04