HarvestText

How well does it work on English text?

Open SinclairCoder opened this issue 4 years ago • 1 comment

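I'm an NLP beginner. I've recently been reading about unsupervised (weakly supervised) sentiment analysis, and I'd like to know how well this project works on English text.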

SinclairCoder, Mar 06 '20 17:03

The sentiment analysis algorithm in this project is based on SO-PMI. For how well it performs, please refer to the original paper.
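For readers new to SO-PMI, here is a minimal sketch of the idea, not HarvestText's actual implementation. It assumes sentence-level co-occurrence, a naive whitespace tokenizer, and a small additive smoothing constant; the function name `so_pmi` and the `smoothing` parameter are made up for illustration. A word's score is the sum of its PMI with the positive seeds minus the sum of its PMI with the negative seeds.

```python
import math
from collections import Counter
from itertools import combinations


def so_pmi(sentences, pos_seeds, neg_seeds, smoothing=0.01):
    """Toy SO-PMI using sentence-level co-occurrence (illustrative only)."""
    # Naive tokenization: lowercase, drop '.' and ',', split on whitespace.
    docs = [set(s.lower().replace(".", " ").replace(",", " ").split())
            for s in sentences]
    n = len(docs)
    # Number of sentences containing each word / each unordered word pair.
    word_count = Counter(w for d in docs for w in d)
    pair_count = Counter(frozenset(p) for d in docs
                         for p in combinations(sorted(d), 2))

    def pmi(w1, w2):
        # Additive smoothing (an assumption) keeps the log finite when
        # the two words never co-occur.
        joint = (pair_count[frozenset((w1, w2))] + smoothing) / n
        return math.log2(joint / ((word_count[w1] / n) * (word_count[w2] / n)))

    scores = {}
    for w in word_count:
        if w in pos_seeds or w in neg_seeds:
            continue  # seed words are not scored against themselves
        pos = sum(pmi(w, s) for s in pos_seeds if s in word_count)
        neg = sum(pmi(w, s) for s in neg_seeds if s in word_count)
        scores[w] = pos - neg
    return scores


sentences = [
    "in the middle of the night.",
    "lonely souls travel in time.",
    "familiar hearts start to entwine.",
    "we imagine what we'll find, in another life.",
]
scores = so_pmi(sentences, pos_seeds={"familiar"}, neg_seeds={"lonely"})
for word in ("hearts", "souls", "night"):
    print(word, round(scores[word], 3))
```

In this toy corpus, "hearts" co-occurs only with the positive seed and gets a positive score, "souls" co-occurs only with the negative seed and gets a negative score, and "night" co-occurs with neither, so its score is zero.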

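This project was originally focused on Chinese, but so many people have asked about English support recently that I pushed an update adding a small amount of English support, including sentiment analysis. Example: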

from harvesttext import HarvestText

# ♪ "Until the Day" by JJ Lin
test_text = """
In the middle of the night.
Lonely souls travel in time.
Familiar hearts start to entwine.
We imagine what we'll find, in another life.
""".lower()

ht_eng = HarvestText(language="en")
sentences = ht_eng.cut_sentences(test_text)  # split into sentences
# Sentiment analysis: build a sentiment dictionary from seed words
sent_dict = ht_eng.build_sent_dict(sentences, pos_seeds=["familiar"], neg_seeds=["lonely"],
                                   min_times=1, stopwords={'in', 'to'})
print("sentiment analysis")
for sent0 in sentences:
    print(sent0, "%.3f" % ht_eng.analyse_sent(sent0))

# A built-in English sentiment lexicon is also provided
# from harvesttext.resources import get_english_senti_lexicon
# sent_lexicon = get_english_senti_lexicon()
# sent_dict = ht_eng.build_sent_dict(sentences, pos_seeds=sent_lexicon["pos"], neg_seeds=sent_lexicon["neg"])
# Then proceed as above

Output:

sentiment analysis
in the middle of the night. 0.000
lonely souls travel in time. -1.600
familiar hearts start to entwine. 1.600
we imagine what we'll find, in another life. 0.000
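
The scores are signed: negative values lean negative, positive values lean positive, and 0.000 means no sentiment evidence was found. If discrete labels are needed, one possible approach is a simple threshold, sketched below. The helper `label_sentiment` and the 0.5 threshold are hypothetical and not part of HarvestText; the scores are copied from the output above.

```python
def label_sentiment(score, threshold=0.5):
    """Map a signed sentiment score to a coarse label (threshold is an assumption)."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


# Scores copied from the example output above.
sentence_scores = {
    "in the middle of the night.": 0.000,
    "lonely souls travel in time.": -1.600,
    "familiar hearts start to entwine.": 1.600,
    "we imagine what we'll find, in another life.": 0.000,
}
labels = {sent: label_sentiment(score) for sent, score in sentence_scores.items()}
for sent, label in labels.items():
    print(label, "|", sent)
```

A suitable threshold depends on the corpus and the seed words, so it should be tuned on real data.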

For the other supported English features, see the README.

blmoistawinde, Mar 08 '20 07:03