State-of-the-art model evaluation

Assessing State-of-the-art Sentiment Models on State-of-the-art Sentiment Datasets

Jeremy Barnes

This experiment runs the best models with the best embeddings as described in the following paper:

Jeremy Barnes, Roman Klinger, and Sabine Schulte im Walde. 2017. Assessing State-of-the-art Sentiment Models on State-of-the-art Sentiment Datasets. In Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis.

Models

  1. Bag-of-Words + L2 regularized Logistic Regression
  2. Averaged Embeddings + L2 regularized Logistic Regression
  3. Retrofitted Embeddings + L2 regularized Logistic Regression
  4. Max, min, and average Sentiment Embeddings + L2 regularized Logistic Regression
  5. LSTM
  6. BiLSTM
  7. CNN
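
As a rough illustration of the feature side of models 2 and 4, the sketch below averages word vectors over a sentence and builds a max, min, and average representation, assumed here to be concatenated into one vector. The toy two-dimensional embeddings, the function names `average_embedding` and `max_min_ave`, and the choice to skip out-of-vocabulary tokens are assumptions made for illustration, not the repository's code. The resulting vectors would then be fed to an L2 regularized Logistic Regression classifier.

```python
from typing import Dict, List

# Toy 2-dimensional embeddings (hypothetical values, for illustration only).
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "good": [1.0, 0.5],
    "bad": [-1.0, 0.0],
    "movie": [0.0, 2.0],
}
DIM = 2


def lookup(tokens: List[str], embeddings: Dict[str, List[float]]) -> List[List[float]]:
    """Return the vectors of in-vocabulary tokens; unknown tokens are skipped."""
    return [embeddings[t] for t in tokens if t in embeddings]


def average_embedding(tokens: List[str],
                      embeddings: Dict[str, List[float]],
                      dim: int = DIM) -> List[float]:
    """Average the word vectors of a sentence, dimension by dimension."""
    vectors = lookup(tokens, embeddings)
    if not vectors:
        return [0.0] * dim  # no known tokens: fall back to a zero vector
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def max_min_ave(tokens: List[str],
                embeddings: Dict[str, List[float]],
                dim: int = DIM) -> List[float]:
    """Concatenate the per-dimension max, min, and average of the word vectors."""
    vectors = lookup(tokens, embeddings)
    if not vectors:
        return [0.0] * (3 * dim)
    columns = list(zip(*vectors))
    maxima = [max(column) for column in columns]
    minima = [min(column) for column in columns]
    averages = [sum(column) / len(column) for column in columns]
    return maxima + minima + averages


if __name__ == "__main__":
    sentence = ["good", "movie"]
    print(average_embedding(sentence, TOY_EMBEDDINGS))
    print(max_min_ave(sentence, TOY_EMBEDDINGS))
```

With d-dimensional embeddings, the averaged representation has d features and the max, min, and average representation has 3d features.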

Datasets

  1. Stanford Sentiment Treebank - fine-grained
  2. Stanford Sentiment Treebank - binary
  3. OpeNER
  4. SenTube Auto
  5. SenTube Tablets
  6. SemEval 2013 Task 2

Requirements

  1. Python 3
  2. tabulate (pip install tabulate)
  3. scikit-learn (pip install -U scikit-learn)
  4. Keras with Theano backend (could work with TensorFlow, but it hasn't been tested)
  5. h5py
  6. Twitter NLP (included)
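
Before running the experiments, it may help to confirm that the Python dependencies listed above can be found. The following is a minimal sketch using only the standard library; the helper name `missing_modules` and the list of import names (`tabulate`, `sklearn`, `keras`, `theano`, `h5py`) are assumptions for illustration and are not part of the repository. Twitter NLP is not checked because it is included.

```python
import importlib.util
from typing import List

# Import names assumed to correspond to the requirements listed above.
REQUIRED_MODULES = ["tabulate", "sklearn", "keras", "theano", "h5py"]


def missing_modules(names: List[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check only confirms that each module can be located; it does not verify versions or which Keras backend is configured.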

Data you need

  1. Word embeddings (available at http://www2.ims.uni-stuttgart.de/data/sota_sentiment/embeddings.zip)
    • Download and unzip them in the sota_sentiment directory
  2. Datasets (provided)
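
For readers who want to inspect the embeddings outside the experiment script, the sketch below parses vectors stored as plain text. The file format is an assumption, since it is not documented here: one word per line followed by its whitespace-separated values, with an optional first line giving vocabulary size and dimensionality. The function names `parse_embedding_lines` and `load_text_embeddings` are hypothetical and not part of the repository.

```python
from typing import Dict, Iterable, List


def parse_embedding_lines(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Parse 'word v1 v2 ...' lines into a dict mapping word to vector.

    Assumed format: whitespace-separated fields, with an optional header
    line of the form 'vocab_size dimension' as the first line.
    """
    embeddings: Dict[str, List[float]] = {}
    for index, line in enumerate(lines):
        parts = line.split()
        # A first line with exactly two fields is treated as the header
        # (this would misread 1-dimensional vectors, which are not expected).
        if index == 0 and len(parts) == 2:
            continue
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        embeddings[parts[0]] = [float(value) for value in parts[1:]]
    return embeddings


def load_text_embeddings(path: str) -> Dict[str, List[float]]:
    """Read a text embedding file from disk (the path is supplied by the caller)."""
    with open(path, encoding="utf-8") as handle:
        return parse_embedding_lines(handle)


if __name__ == "__main__":
    sample = ["2 3", "good 0.1 0.2 0.3", "bad -0.1 0.0 0.5"]
    vectors = parse_embedding_lines(sample)
    print(sorted(vectors))
    print(len(vectors["good"]))
```

If the unzipped files turn out to be in a different format, such as binary, this parser would not apply.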

Running the program

If you want to reproduce the best results for each model reported in the paper, simply clone the repository, download the word embeddings and unzip them, and run the experiment script:

git clone https://github.com/jbarnesspain/sota_sentiment.git
cd sota_sentiment
wget http://www2.ims.uni-stuttgart.de/data/sota_sentiment/embeddings.zip
unzip embeddings.zip
chmod +x sota_experiment.sh
./sota_experiment.sh

Output

The results will be written to results/results.txt.

The predictions will be saved in /predictions.

Reference

@inproceedings{Barnes2017,
  author = {Barnes, Jeremy and Klinger, Roman and Schulte im Walde, Sabine},
  title = {Assessing State-of-the-Art Sentiment Models on State-of-the-Art Sentiment Datasets},
  booktitle = {Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis},
  year = {2017},
  address = {Copenhagen, Denmark}
}