Movie-Review-Sentiment-Analysis-LSTM-Pytorch
There is a possible `KeyError` when generating the encoded reviews.
https://github.com/lukysummer/Movie-Review-Sentiment-Analysis-LSTM-Pytorch/blob/master/sentiment_analysis_LSTM.py#L38
encoded_reviews = [[vocab_to_int[word] for word in review] for review in all_reviews]
Here the variable `review` is a string, so `[vocab_to_int[word] for word in review]` iterates over its characters rather than its words.
Each letter is then looked up in `vocab_to_int`, which throws a `KeyError` for any character that is not in the vocabulary.
I think it should be
encoded_reviews = [[vocab_to_int[word] for word in review.split()] for review in all_reviews]
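A minimal reproduction of the problem and the fix. The two sample reviews and the way `vocab_to_int` is built here are assumptions for illustration, not the repository's code:

```python
# Toy data standing in for the real reviews (hypothetical).
all_reviews = ["great movie", "bad movie"]

# Word-level vocabulary; starting the indices at 1 is an assumption of this sketch.
all_words = " ".join(all_reviews).split()
vocab_to_int = {word: i for i, word in enumerate(sorted(set(all_words)), start=1)}

# Iterating over a string yields characters, not words.
first_chars = list(all_reviews[0])[:5]
print(first_chars)

# Looking up single letters in a word-level vocabulary fails.
raised = False
try:
    [[vocab_to_int[word] for word in review] for review in all_reviews]
except KeyError as err:
    raised = True
    print("KeyError:", err)

# Splitting each review first gives word-level lookups.
encoded_reviews = [[vocab_to_int[word] for word in review.split()] for review in all_reviews]
print(encoded_reviews)
```

With this toy vocabulary, the unsplit version fails on the very first letter, while the `.split()` version encodes both reviews.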
Or you could modify your `preprocess` function instead:
from string import punctuation

def preprocess(text):
    text = text.lower()
    text = "".join([ch for ch in text if ch not in punctuation])
    # Each review becomes a list of words.
    all_reviews = [i.split() for i in text.split("\n")]
    # split() also splits on newlines, so no join is needed here.
    all_words = text.split()
    return all_reviews, all_words
So your reviews will look like:
[
    ['hey', 'this', 'is', 'review'],
    ['hey', 'this', 'is', 'review2'],
    ...
]
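A sketch of how reviews in this tokenized form could feed the encoding step without a further `.split()`. The name `preprocess_tokenized`, the sample text, and the vocabulary construction are hypothetical. Building `all_words` by flattening the reviews is a choice made for this sketch:

```python
from string import punctuation

def preprocess_tokenized(text):
    """Hypothetical variant: return each review as a list of words."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in punctuation)
    # One list of words per line of input.
    all_reviews = [line.split() for line in text.split("\n")]
    # Flatten the per-review word lists into a single list of words.
    all_words = [word for review in all_reviews for word in review]
    return all_reviews, all_words

sample = "Hey, this is review!\nHey, this is review2."
all_reviews, all_words = preprocess_tokenized(sample)

# Assumed vocabulary construction: sorted unique words, indices from 1.
vocab_to_int = {w: i for i, w in enumerate(sorted(set(all_words)), start=1)}

# Each review is already a list of words, so the original comprehension works as written.
encoded_reviews = [[vocab_to_int[word] for word in review] for review in all_reviews]
print(encoded_reviews)
```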
def preprocess(text):
    text = text.lower()
    text = "".join([ch for ch in text if ch not in punctuation])
    all_reviews = [i for i in text.split("\n")]
    text = " ".join(text)
    all_words = text.split()
    return all_reviews, all_words
Here, `text = " ".join(text)` should be `text = " ".join(all_reviews)`, because joining the string itself puts a space between every character.
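To illustrate the difference, here is a small sketch. The sample strings are made up, and `preprocess_fixed` is a hypothetical name for the function with the suggested change applied:

```python
from string import punctuation

# Joining a string treats it as a sequence of characters,
# so a later split() returns single letters.
letters = " ".join("good film").split()
print(letters)

def preprocess_fixed(text):
    """Hypothetical version of preprocess with the suggested change applied."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in punctuation)
    all_reviews = text.split("\n")
    # Join the list of reviews, not the string itself.
    text = " ".join(all_reviews)
    all_words = text.split()
    return all_reviews, all_words

all_reviews, all_words = preprocess_fixed("Good film.\nBad film!")
print(all_reviews)
print(all_words)
```

Note that `all_reviews` is still a list of strings in this version, so the encoding step would still need `review.split()`.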