evals icon indicating copy to clipboard operation
evals copied to clipboard

Idea for Evals: Emotion and sentiment analysis Evals

Open Pabreetzio opened this issue 1 year ago • 6 comments

Understanding emotions and sentiments is an essential aspect of human communication. The system's ability to recognize these emotions enables more appropriate, context-aware, and empathetic responses. Emotions and sentiments can be complex and nuanced, requiring a system to perform fine-grained analysis and interpretation. Evaluating a system's performance in this area provides insights into its overall ability to process and understand subtleties in language.

There are several different emotions that Evals could be written for:

  • Sarcasm (already covered in #56 )
  • Positive sentiment (e.g., happiness, joy, excitement, gratitude)
  • Negative sentiment (e.g., anger, frustration, sadness, disappointment)
  • Neutral sentiment (e.g., indifference, objectivity, impartiality)
  • Fear (e.g., anxiety, apprehension, panic, terror)
  • Surprise (e.g., amazement, astonishment, bewilderment)
  • Love (e.g., affection, adoration, romance, passion)
  • Anticipation (e.g., eagerness, hope, curiosity)
  • Disgust (e.g., revulsion, repulsion, aversion, contempt)
  • Empathy (e.g., compassion, understanding, sympathy)
  • Confusion (e.g., perplexity, puzzlement, disorientation)
  • Pride (e.g., self-esteem, arrogance, satisfaction)
  • Envy (e.g., jealousy, covetousness, resentment)
  • Guilt (e.g., remorse, regret, shame)
  • Trust (e.g., confidence, faith, reliance)
  • Skepticism (e.g., doubt, disbelief, suspicion)

I suppose this could be covered in one Eval or many. If many, it might be good to come up with some sort of structure for organizing Evals for different sentiments or at least documenting what was covered already and what has yet to be covered. Sarcasm, for instance, was covered with an eval that contained news articles from the Onion. While article headlines are certainly one way of testing for sarcasm, others could be a dataset of tweets or reviews of products or examples from fiction.

Pabreetzio avatar Apr 11 '23 20:04 Pabreetzio