overview-and-benchmark-of-traditional-and-deep-learning-models-in-text-classification

Dataset can't be loaded.

Open: guillaume-chevalier opened this issue 5 years ago • 1 comment

I tried to download the dataset, but I couldn't find a working source. Starting from your recent article, https://ahmedbesbes.com/overview-and-benchmark-of-traditional-and-deep-learning-models-in-text-classification.html, I followed the link to http://thinknook.com/twitter-sentiment-analysis-training-corpus-dataset-2012-09-22/, and from there to http://www.sananalytics.com/lab/twitter-sentiment/. However, that last sananalytics.com link doesn't load at all.

Alternatively, I tried to get the data from your previous blog post: https://ahmedbesbes.com/sentiment-analysis-on-twitter-using-word2vec-and-keras.html I downloaded the dataset from the Google Drive link, but the file seems to be broken. I copied your def ingest(): function and ran it. At first the file didn't load at all, and I had to change the encoding to latin-1. Then I realized the dataset had no column names. I got the error ValueError: labels ['ItemID' 'SentimentSource'] not contained in axis, raised on this line: data.drop(['ItemID', 'SentimentSource'], axis=1, inplace=True).

I wonder how I could reproduce your experiments, or at least use the same data for a quick comparison. I didn't try anything beyond what I described above. I guess adding the column names manually might do it, but I suspect other things would then fail further down the road. It would be great if you could provide an easy data loading pipeline.
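
For reference, the "add the column names manually" idea could look roughly like the sketch below. It is not the ingest() function from the blog post. The helper name load_tweets is hypothetical, and the four-column layout is an assumption: only 'ItemID' and 'SentimentSource' appear in the error message above, while 'Sentiment' and 'SentimentText' are guesses that should be checked against the actual file.

```python
import pandas as pd

# ASSUMPTION: only 'ItemID' and 'SentimentSource' come from the error message;
# 'Sentiment' and 'SentimentText' are guessed names for the other two columns.
ASSUMED_COLUMNS = ["ItemID", "Sentiment", "SentimentSource", "SentimentText"]


def load_tweets(source, columns=ASSUMED_COLUMNS, encoding="latin-1"):
    """Read the CSV and name the columns by hand, with or without a header row.

    `source` is a file path or a file-like object. This helper is hypothetical.
    """
    # Read every row as data so a headerless file does not lose its first row.
    data = pd.read_csv(source, encoding=encoding, header=None, dtype=str)

    # If the first row already holds the column names, drop it from the data.
    if list(data.iloc[0]) == list(columns):
        data = data.iloc[1:].reset_index(drop=True)

    # Raises ValueError if the file does not have exactly len(columns) columns.
    data.columns = list(columns)

    # errors="ignore" keeps drop() from raising when a label is missing.
    return data.drop(columns=["ItemID", "SentimentSource"], errors="ignore")
```

Because everything is read with dtype=str, the label column would still need converting to integers afterwards, and rows with a different number of fields would need separate handling.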

Thanks!

guillaume-chevalier, Sep 20 '18 02:09

Hello Guillaume,

Here's the link to download the dataset: http://thinknook.com/wp-content/uploads/2012/09/Sentiment-Analysis-Dataset.zip
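
A quick way to check whether this archive has a header row before feeding it to the rest of the pipeline might be the following sketch. The function names fetch_zip and first_csv_lines are hypothetical, the name of the CSV inside the zip is not stated here, so the code simply takes the first .csv member it finds, and latin-1 is assumed because that is the encoding that worked above.

```python
import io
import urllib.request
import zipfile

# URL taken from the reply above.
DATASET_URL = "http://thinknook.com/wp-content/uploads/2012/09/Sentiment-Analysis-Dataset.zip"


def fetch_zip(url=DATASET_URL):
    """Download the whole archive into memory and return its bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def first_csv_lines(zip_bytes, n=5, encoding="latin-1"):
    """Return the first n lines of the first .csv member of the archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        csv_names = [name for name in archive.namelist()
                     if name.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError("no .csv file found in the archive")
        with archive.open(csv_names[0]) as raw:
            text = io.TextIOWrapper(raw, encoding=encoding, newline="")
            # zip() stops after n lines, so the full file is never read.
            return [line.rstrip("\r\n") for _, line in zip(range(n), text)]
```

Calling first_csv_lines(fetch_zip()) prints nothing by itself; inspecting the returned list shows whether the first line contains column names such as ItemID or goes straight to data.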

Ahmed

ahmedbesbes, Sep 20 '18 08:09