Dictionary contains phrases like "fed up" that will never hit because of how the sentence is tokenized

Open kirsten-stallings opened this issue 4 years ago • 1 comment

The dictionary contains phrases like "fed up", but because the code looks up each word in the dictionary individually, these phrases never match:

>>> from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
>>> analyzer = SentimentIntensityAnalyzer()
>>> analyzer.polarity_scores("I am fed up")
{'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}
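
To illustrate why a per-word lookup cannot match a multi-word key, here is a minimal sketch that does not use vaderSentiment at all. The `toy_lexicon` dictionary, its scores, and the `matched_entries` helper are hypothetical stand-ins, and the whitespace split only approximates the library's tokenization:

```python
# Toy stand-in for the sentiment lexicon; keys and scores are illustrative only.
toy_lexicon = {"fed up": -1.8, "happy": 2.7}

def matched_entries(text, lexicon):
    """Return the tokens that a word-by-word lookup finds in the lexicon."""
    tokens = text.lower().split()  # splits on whitespace, so "fed up" becomes two tokens
    return [tok for tok in tokens if tok in lexicon]

print(matched_entries("I am fed up", toy_lexicon))  # neither "fed" nor "up" is a key
print(matched_entries("I am happy", toy_lexicon))   # the single-word key is found
```

Since neither "fed" nor "up" is a key on its own, the phrase contributes nothing to the score.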

kirsten-stallings avatar Feb 04 '21 19:02 kirsten-stallings

If I understand the code correctly, "fed up" (and any other multi-word phrase) should be removed from the vader_lexicon.txt file and added to SENTIMENT_LADEN_IDIOMS instead, but the code that would handle those idioms appears to be a placeholder for a future addition.

I found a workaround for handling bigrams (2-word phrases) on Stack Overflow: https://stackoverflow.com/questions/67798527/nltk-vader-sentimentintensityanalyzer-bigram
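
One possible way to do this (a sketch, and not necessarily the approach taken in the linked answer) is to join each known phrase into a single underscore-separated token before scoring and to store the phrase under that same token in the lexicon. The helper names `merge_phrases` and `rekey_lexicon` are hypothetical, and the score used below is illustrative:

```python
import re

def merge_phrases(text, phrases):
    """Rewrite each known multi-word phrase as one underscore-joined token."""
    # Longest phrases first, so a longer phrase is not split by a shorter one.
    for phrase in sorted(phrases, key=len, reverse=True):
        words = [re.escape(w) for w in phrase.split()]
        pattern = r"\b" + r"\s+".join(words) + r"\b"
        token = phrase.replace(" ", "_")
        # A callable replacement avoids any backslash interpretation in the token.
        text = re.sub(pattern, lambda m, t=token: t, text, flags=re.IGNORECASE)
    return text

def rekey_lexicon(lexicon):
    """Return a copy of the lexicon with spaces in keys replaced by underscores."""
    return {key.replace(" ", "_"): score for key, score in lexicon.items()}

phrases = ["fed up"]
print(merge_phrases("I am fed up", phrases))
print(rekey_lexicon({"fed up": -1.8}))
```

With vaderSentiment itself, the assumption would be to add the re-keyed entries to the analyzer's `lexicon` dictionary and pass `merge_phrases(text, phrases)` to `polarity_scores`. I have not verified this against every version of the library. Note also that the case-insensitive match lowercases the merged token, so any all-caps emphasis on the phrase is lost.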

ViennaMike avatar Nov 17 '22 18:11 ViennaMike