Where can I get "imdb-train.pkl"?

Open: k-terada opened this issue 5 years ago • 1 comment

I'm looking at pytorch-imdb-bert.py. Where can I get "imdb-train.pkl"?

k-terada, Oct 10 '20 05:10

The files imdb-train.pkl and imdb-test.pkl are just slightly processed versions of the original data from http://ai.stanford.edu/~amaas/data/sentiment/. You can get the sentences and polarity values from the original data. For reference, this is what the training file looks like when loaded:

train_df = pd.read_pickle("/media/data2/imdb/imdb-train.pkl")
print(train_df.sample(10))

                                                sentence sentiment  polarity
15135  Just a dumb old movie. First Stanwyck's son ge...         2         0
22916  A meteorite falls in the country of a small to...         7         1
20820  Whether it's a good movie or not, films of thi...         7         1
17389  The '60s is an occasionally entertaining film,...         2         0
20392  As I work at a video store, I found it to be m...         1         0
17671  Everyone in the cast, from Sugiyama to Aoki an...        10         1
16207  The only connection this movie has to horror i...         1         0
19790  The show had great episodes, this is not one o...         4         0
5569   I thought this film was just about perfect. Th...         9         1
21911  I sat through this film and i have to say it o...         1         0

jmakoske, Oct 12 '20 08:10
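
For anyone wanting to rebuild a comparable file, here is a minimal sketch, not the script that produced the original pickles. It assumes the archive from the Stanford page has been extracted to a directory containing pos/ and neg/ subfolders (for example aclImdb/train), with review files named <id>_<rating>.txt. The function names load_reviews and save_pickle are hypothetical, and whatever extra "slight processing" was applied to the original files is not reproduced here.

```python
from pathlib import Path


def load_reviews(split_dir):
    """Collect reviews from one extracted split directory (e.g. aclImdb/train).

    Assumes pos/ and neg/ subfolders with files named "<id>_<rating>.txt".
    """
    rows = []
    for label, polarity in (("neg", 0), ("pos", 1)):
        for path in sorted(Path(split_dir, label).glob("*.txt")):
            # The part after the underscore is the star rating given by the reviewer.
            rating = int(path.stem.split("_")[1])
            rows.append({
                "sentence": path.read_text(encoding="utf-8"),
                "sentiment": rating,
                "polarity": polarity,
            })
    return rows


def save_pickle(split_dir, out_path):
    """Write the collected rows as a pandas DataFrame pickle (requires pandas)."""
    import pandas as pd  # imported lazily so load_reviews works without pandas

    pd.DataFrame(load_reviews(split_dir)).to_pickle(out_path)
```

Calling save_pickle("aclImdb/train", "imdb-train.pkl") and then pd.read_pickle("imdb-train.pkl") should give a frame with the same three columns as in the sample above (sentence, sentiment, polarity), though row order and text cleanup may differ from the original files.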