NRCLex
Whole new features
What's new:
- Read from an NRC lexicon txt file, in any format and any language
- Word-by-word lemmatization isn't enough, so the loaded lexicon is expanded instead
- Fewer dependencies
- Add badge
- Negation support. Fixes #4
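As a rough illustration of the first item, here is a minimal sketch of reading a word-level lexicon file. It assumes the tab-separated `word<TAB>emotion<TAB>flag` layout used by the NRC word-level emotion lexicon; other formats would need their own parser. The function name `load_wordlevel_lexicon` and the sample rows are made up for this example and are not part of NRCLex.

```python
import io
from collections import defaultdict


def load_wordlevel_lexicon(handle):
    """Parse 'word<TAB>emotion<TAB>flag' lines into {word: [emotions]}.

    Hypothetical helper: only emotions flagged with "1" are kept.
    """
    lexicon = defaultdict(list)
    for line in handle:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        word, emotion, flag = parts
        if flag == "1":
            lexicon[word].append(emotion)
    return dict(lexicon)


# Made-up sample rows standing in for a real lexicon file.
sample = io.StringIO("argue\tanger\t1\nargue\tjoy\t0\nargue\tnegative\t1\n")
lexicon = load_wordlevel_lexicon(sample)
print(lexicon)
```

With a real file, the same function could be called on `open(path, encoding="utf-8")` instead of the in-memory sample.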
Example
Text: she's always arguing!
New:
{'fear': 0.0,
 'anger': 0.3333333333333333,
 'anticipation': 0.0,
 'trust': 0.3333333333333333,
 'surprise': 0.0,
 'positive': 0.0,
 'negative': 0.3333333333333333,
 'sadness': 0.0,
 'disgust': 0.0,
 'joy': 0.0}
Previous:
{'fear': 0.0,
 'anger': 0.0,
 'anticip': 0.0,
 'trust': 0.0,
 'surprise': 0.0,
 'positive': 0.0,
 'negative': 0.0,
 'sadness': 0.0,
 'disgust': 0.0,
 'joy': 0.0}
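The 0.333... values in the new output are consistent with three affect hits, each divided by the total number of hits. The sketch below shows that normalization under that assumption; it is not the library's actual code, and `affect_frequencies` (as a free function), `toy_lexicon`, and `EMOTIONS` are hypothetical names with made-up entries.

```python
from collections import Counter

EMOTIONS = ["anger", "joy", "negative", "positive"]

# Made-up toy lexicon, not taken from the NRC data.
toy_lexicon = {"storm": ["anger", "negative"], "gift": ["joy", "positive"]}


def affect_frequencies(tokens, lexicon, emotions):
    """Count emotion hits per token, then divide by the total hit count."""
    counts = Counter()
    for token in tokens:
        counts.update(lexicon.get(token, []))
    total = sum(counts.values())
    # With no hits at all, every frequency stays 0.0 (as in the old output).
    return {e: (counts[e] / total if total else 0.0) for e in emotions}


freqs = affect_frequencies(["the", "storm"], toy_lexicon, EMOTIONS)
print(freqs)
```

A text whose words are all missing from the lexicon yields all zeros, which is what the previous version returned for the example sentence.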
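To see all of the possible expansions:

    # use the old version
    from nltk.corpus import wordnet  # requires the NLTK WordNet corpus

    def expand_synonyms(lex):
        """Return WordNet synonyms of lexicon words that the lexicon lacks."""
        lex_ = {}
        for i in lex:
            for j in wordnet.synsets(i):
                for k in j.lemmas():
                    # WordNet joins multi-word lemmas with underscores
                    word = k.name().replace('_', ' ')
                    if word not in lex:
                        # the synonym inherits the entry of the original word
                        lex_[word] = lex[i]
        return lex_

    # nrc is presumably an already-loaded NRCLex instance
    expand_synonyms(nrc.lexicon)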