sentometrics
Denominator for "proportionalPol" sentiment computation
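It has always bothered me that compute_sentiment can yield values outside the [-1, 1] range when using the "proportionalPol" method: a valence shifter scales the numerator, but the denominator still only counts the polarized words.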
library(sentometrics)
sample_text <- setNames(nm = c("C'est un abandon", "C'est un vaste abandon"))
compute_sentiment(
  sample_text,
  lexicons = sentometrics::sento_lexicons(
    list(LM_sample = head(sentometrics::list_lexicons$LM_fr_tr, 5)),
    list_valence_shifters$fr
  ),
  how = "proportionalPol"
)
#>                        id word_count LM_sample
#> 1:       C'est un abandon          3      -1.0
#> 2: C'est un vaste abandon          4      -1.8
Created on 2021-12-03 by the reprex package (v2.0.0)
With this change, I adjusted the denominator so that it takes into account sentiment words that have been amplified or deamplified. The sentiment will therefore always lie within the [-1, 1] interval. The same example after the change:
library(sentometrics)
sample_text <- setNames(nm = c("C'est un abandon", "C'est un vaste abandon"))
compute_sentiment(
  sample_text,
  lexicons = sentometrics::sento_lexicons(
    list(LM_sample = head(sentometrics::list_lexicons$LM_fr_tr, 5)),
    list_valence_shifters$fr
  ),
  how = "proportionalPol"
)
#>                        id word_count LM_sample
#> 1:       C'est un abandon          3        -1
#> 2: C'est un vaste abandon          4        -1
Created on 2021-12-03 by the reprex package (v2.0.0)
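To make the arithmetic explicit, here is a minimal base-R sketch of the two denominators. The helpers prop_pol and prop_pol_norm are hypothetical and not part of sentometrics; they only mimic the computation for one document. I assume "vaste" acts as an amplifier with a multiplier of 1.8 (which is what the -1.8 output suggests) and that the adjusted denominator is the sum of the absolute shifted scores.

```r
# Hypothetical helpers, NOT sentometrics functions: they only mimic the arithmetic.
# scores:   lexicon score of each polarized word in the document
# shifters: valence-shifter multiplier applied to each word (1 = no shifter)

prop_pol <- function(scores, shifters) {
  # Current behaviour: divide by the number of polarized words.
  sum(scores * shifters) / length(scores)
}

prop_pol_norm <- function(scores, shifters) {
  # Assumed adjusted behaviour: divide by the total absolute shifted weight,
  # so the ratio cannot leave [-1, 1].
  adjusted <- scores * shifters
  sum(adjusted) / sum(abs(adjusted))
}

# "C'est un vaste abandon": "abandon" scores -1, amplified by "vaste" (assumed 1.8)
prop_pol(-1, 1.8)       # -1.8
prop_pol_norm(-1, 1.8)  # -1
```

Neither helper guards against a document without polarized words (both return NaN there); a real implementation would have to decide what to return in that case.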
I think it's better to add a new weighting option than to change one which has been around for many versions and is implemented correctly. In the end it's just a choice and a matter of preference.
I would thus add a new weighting scheme, for instance called proportionalPolNorm, and clearly document it. It should also be added to the get_hows() function.