tf-idf-keyword

Chinese keyword extraction based on TF-IDF computed over a specific corpus.

Results: 4 tf-idf-keyword issues

The idf.txt I generated from my own corpus comes out garbled. How should I handle this?

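1. How should the quality of keywords extracted with TF-IDF be evaluated? Specifically, which mathematical evaluation method should be used? 2. How well does TF-IDF work on user reviews from e-commerce sites? Does it have any drawbacks? Thanks, looking forward to your reply.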
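On the first question, one common way to put a number on keyword quality, assuming a small set of human-labelled reference keywords exists for each document, is precision, recall, and F1 over the top k extracted terms. The sketch below is only an illustration and is not taken from this repository; the function name `precision_recall_f1_at_k` and its arguments are hypothetical, and it assumes the predicted list is ranked and free of duplicates.

```python
def precision_recall_f1_at_k(predicted, gold, k):
    """Score the top-k predicted keywords against a gold keyword set.

    predicted: ranked list of extracted keywords (assumed duplicate-free)
    gold:      iterable of human-labelled reference keywords
    """
    top_k = predicted[:k]
    gold_set = set(gold)
    # A hit is a predicted keyword that also appears in the reference set.
    hits = sum(1 for term in top_k if term in gold_set)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(gold_set) if gold_set else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1
```

Averaging these scores over many labelled documents gives a corpus-level figure; exact-match scoring like this is strict and will not credit near-synonyms.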

Hello, I don't have much experience with this area. How is the idf file generated?
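As a rough illustration of how an IDF file can be built from a custom corpus, the sketch below counts document frequencies over already-segmented documents and writes one `term idf` pair per line. It is not this repository's actual script: the function names `compute_idf`, `write_idf`, and `read_idf`, the file layout, and the smoothed formula `ln((1 + N) / (1 + df)) + 1` are all assumptions chosen for the example. The file is opened with an explicit `encoding="utf-8"`, since a mismatch between the encoding used for writing and the one used for reading is a frequent cause of garbled Chinese text.

```python
import math
from collections import Counter


def compute_idf(tokenized_docs):
    """Return {term: idf} using the smoothed form ln((1 + N) / (1 + df)) + 1."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for tokens in tokenized_docs:
        # Count each term at most once per document.
        doc_freq.update(set(tokens))
    return {
        term: math.log((1 + n_docs) / (1 + df)) + 1
        for term, df in doc_freq.items()
    }


def write_idf(idf, path):
    """Write one 'term idf' pair per line, explicitly encoded as UTF-8."""
    with open(path, "w", encoding="utf-8") as f:
        for term, value in sorted(idf.items()):
            f.write(f"{term} {value:.6f}\n")


def read_idf(path):
    """Read the file back with the same encoding it was written in."""
    with open(path, encoding="utf-8") as f:
        return {term: float(value) for term, value in (line.split() for line in f)}
```

The input is a list of token lists, so any segmenter can be used to produce it. Terms are assumed to contain no whitespace, because the file format separates term and value with a single space.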

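```
import re
import jieba

def segment(sentence, cut_all=False):
    # Strip newlines, ideographic spaces, and non-breaking spaces before segmenting.
    sentence = sentence.replace('\n', '').replace('\u3000', '').replace('\u00A0', '')
    sentence = ' '.join(jieba.cut(sentence, cut_all=cut_all))
    # Alternatively, the replacement could be done first and the segmentation afterwards.
    return re.sub('[a-zA-Z0-9.。::,,))((!!??”“\"]', '', sentence).split()
```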