Keyword-BERT

Confirming syntax issues in the code

Open EvelynZhaoShiMei opened this issue 3 years ago • 1 comments

Hello, I have two questions about the match function in convert_to_bert_keyword.py:

1. When matching English keywords, the code still calls the Chinese matching function:

    def match(s, kws):
        kw_index = set()
        for kw in kws:
            if re.match(r'^[\u4e00-\u9fff]+$', kw):
                kw_index |= set(match_ch(s, kw))
            elif re.match(r'^[a-zA-Z]+$', kw):
                kw_index |= set(match_ch(s, kw))  # my understanding is that this branch is meant for English matching
            else:
                continue
        return kw_index

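2. In the English matching function, the string handling looks wrong:

    def match_en(s, kw):
        kw_index = []
        for idx, e in enumerate(s):
            e.replace('#', '')  # strings are immutable, so without reassigning the result this is most likely a no-op
            if e in kw:
                kw_index.append(idx)
        return kw_index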
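For reference, here is a minimal sketch of what the two fixes might look like. It is not the maintainers' code. It assumes that s is a list of WordPiece tokens (so continuation pieces carry a `##` prefix), and since match_ch is not shown in this issue, the version below is a hypothetical stand-in that does a plain per-token containment check. The empty-token guard in match_en is also an addition of this sketch, not something taken from the repository.

```python
import re


def match_ch(s, kw):
    # Hypothetical stand-in: the real match_ch is not shown in this issue.
    return [idx for idx, e in enumerate(s) if e in kw]


def match_en(s, kw):
    kw_index = []
    for idx, e in enumerate(s):
        # str.replace returns a new string, so the result must be reassigned.
        e = e.replace('#', '')
        # Guard added in this sketch: an empty string is "in" every string.
        if e and e in kw:
            kw_index.append(idx)
    return kw_index


def match(s, kws):
    kw_index = set()
    for kw in kws:
        if re.match(r'^[\u4e00-\u9fff]+$', kw):
            # Keyword made up only of CJK characters.
            kw_index |= set(match_ch(s, kw))
        elif re.match(r'^[a-zA-Z]+$', kw):
            # Keyword made up only of ASCII letters: use the English matcher.
            kw_index |= set(match_en(s, kw))
    return kw_index
```

With tokens such as `['key', '##word', '关', '键']` and the keyword `'keyword'`, the index of `'##word'` is only found once the result of `replace` is reassigned, because `'##word' in 'keyword'` is false while `'word' in 'keyword'` is true.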

EvelynZhaoShiMei May 29 '21 12:05

BTW, could you share the code for your keyword extraction algorithm (the PMI + diff_idf method from the paper)?

EvelynZhaoShiMei May 29 '21 12:05