sota_sentiment
State-of-the-art model evaluation
Assessing State-of-the-art Sentiment Models on State-of-the-art Sentiment Datasets
Jeremy Barnes
This experiment runs the best models with the best embeddings as described in the following paper:
Jeremy Barnes, Roman Klinger, and Sabine Schulte im Walde. 2017. Assessing State-of-the-art Sentiment Models on State-of-the-art Sentiment Datasets. In Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis.
Models
- Bag-of-Words + L2 regularized Logistic Regression
- Averaged Embeddings + L2 regularized Logistic Regression
- Retrofitted Embeddings + L2 regularized Logistic Regression
- max, min, ave Sentiment Embeddings + L2 regularized Logistic Regression
- LSTM
- BiLSTM
- CNN
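As a rough illustration of the averaged-embeddings baseline listed above, the sketch below averages the word vectors of each sentence and fits scikit-learn's LogisticRegression, whose default penalty is L2. The toy embedding table, the example sentences, the `average_embedding` helper, and the value of `C` are all assumptions made for illustration; none of them is taken from the repository or the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy 3-dimensional embeddings (hypothetical; the real experiments use
# pretrained vectors that are downloaded separately).
EMBEDDINGS = {
    "good": np.array([1.0, 0.5, 0.0]),
    "great": np.array([0.9, 0.7, 0.1]),
    "bad": np.array([-1.0, -0.5, 0.0]),
    "awful": np.array([-0.9, -0.7, 0.1]),
    "movie": np.array([0.0, 0.0, 1.0]),
}
DIM = 3


def average_embedding(tokens, embeddings=EMBEDDINGS, dim=DIM):
    """Return the mean vector of the known tokens, or zeros if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return np.zeros(dim)
    return np.mean(vectors, axis=0)


train_sentences = [
    ["good", "movie"],
    ["great", "movie"],
    ["bad", "movie"],
    ["awful", "movie"],
]
train_labels = [1, 1, 0, 0]

# One row per sentence, one column per embedding dimension.
X_train = np.vstack([average_embedding(s) for s in train_sentences])

# LogisticRegression applies an L2 penalty by default; C is the inverse
# regularization strength (smaller C means stronger regularization).
clf = LogisticRegression(C=1.0)
clf.fit(X_train, train_labels)

# Out-of-vocabulary tokens are ignored, so an unknown-only sentence maps to zeros.
X_test = np.vstack([average_embedding(["great"]), average_embedding(["unknown"])])
predictions = clf.predict(X_test)
```

In practice the regularization strength would be tuned on a development set rather than fixed at 1.0.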
Datasets
- Stanford Sentiment Treebank - fine-grained
- Stanford Sentiment Treebank - binary
- OpeNER
- SenTube Auto
- SenTube Tablets
- SemEval 2013 Task 2
Requirements
- Python 3
- tabulate
  pip install tabulate
- sklearn
  pip install -U scikit-learn
- Keras with Theano backend (could work with TensorFlow, but it hasn't been tested)
- H5py
- Twitter NLP (included)
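Before running the experiments, it can help to confirm that the Python packages listed above are importable. The following is a minimal sketch using only the standard library; the module names in `REQUIRED_MODULES` are the usual import names for these packages and are an assumption, and the `find_missing` helper is hypothetical. It does not check the Keras backend or the bundled Twitter NLP code.

```python
import importlib.util

# Import names assumed for the packages listed above; note that the
# scikit-learn package is imported as "sklearn".
REQUIRED_MODULES = ["tabulate", "sklearn", "keras", "h5py"]


def find_missing(modules):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```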
Data you need
- Word embeddings (available here)
- Download and unzip them into the sota_sentiment directory
- Datasets (provided)
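For readers who want to inspect the embeddings directly, here is a small loader sketch. It assumes a plain-text format with one word per line followed by space-separated floats, optionally preceded by a word2vec-style header line; the downloaded files may use a different format, so treat both the format and the `load_text_embeddings` helper as assumptions rather than the repository's own loading code.

```python
import io


def load_text_embeddings(handle):
    """Parse lines of the form 'word v1 v2 ... vn' into a dict of float lists.

    Lines with fewer than two fields are skipped, and so is a
    word2vec-style header line consisting of exactly two integers.
    """
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue
        if len(parts) == 2 and all(p.isdigit() for p in parts):
            continue  # header: vocabulary size and dimensionality
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


# Hypothetical two-word file, held in memory for the example.
sample = io.StringIO("2 3\ngood 0.1 0.2 0.3\nbad -0.1 -0.2 -0.3\n")
embeddings = load_text_embeddings(sample)
```

The same function accepts an open file object, so a real file could be passed with `open(path, encoding="utf-8")`, where `path` points at whichever embedding file is being inspected.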
Running the program
If you want to reproduce the best results for each model reported in the paper, simply clone the repository, download the word embeddings and unzip them, and run the experiment script:
git clone https://github.com/jbarnesspain/sota_sentiment.git
cd sota_sentiment
wget http://www2.ims.uni-stuttgart.de/data/sota_sentiment/embeddings.zip
unzip embeddings.zip
chmod +x sota_experiment.sh
./sota_experiment.sh
Output
The results are written to results/results.txt.
The predictions are saved in the predictions/ directory.
Reference
@inproceedings{Barnes2017,
author = {Barnes, Jeremy and Klinger, Roman and Schulte im Walde, Sabine},
title = {Assessing State-of-the-Art Sentiment Models on State-of-the-Art Sentiment Datasets},
booktitle = {Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis},
year = {2017},
address = {Copenhagen, Denmark}
}