Movie-Review-Sentiment-Analysis-LSTM-Pytorch

There is a possible `KeyError` while generating the encoded reviews.

Open superryeti opened this issue 5 years ago • 1 comments

https://github.com/lukysummer/Movie-Review-Sentiment-Analysis-LSTM-Pytorch/blob/master/sentiment_analysis_LSTM.py#L38

encoded_reviews = [[vocab_to_int[word] for word in review] for review in all_reviews]

Here the variable `review` is a string, so `[vocab_to_int[word] for word in review]` raises an error: iterating over a string yields individual letters, not words.

In other words, you will be looking up the index of each letter, which throws a `KeyError` for any character that is not itself a key in `vocab_to_int`.

I think it should be

encoded_reviews = [[vocab_to_int[word] for word in review.split()] for review in all_reviews]
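To make the difference concrete, here is a minimal sketch on toy data. The two reviews and the way `vocab_to_int` is built are assumptions made up for illustration, not taken from the repository; only the two list comprehensions mirror the lines discussed above.

```python
# Toy data: these reviews and this vocabulary are hypothetical.
all_reviews = ["great movie", "bad plot"]
words = " ".join(all_reviews).split()
vocab_to_int = {word: i for i, word in enumerate(sorted(set(words)), start=1)}

# Iterating over a string yields single characters, not words.
chars = [ch for ch in all_reviews[0]]

# The original comprehension looks up characters and fails on the first
# one that is not a key in the vocabulary.
missing_key = None
try:
    broken = [[vocab_to_int[word] for word in review] for review in all_reviews]
except KeyError as err:
    missing_key = err.args[0]

# Splitting each review first yields whole words, so every lookup succeeds.
encoded_reviews = [
    [vocab_to_int[word] for word in review.split()] for review in all_reviews
]

print(chars[:5])
print(repr(missing_key))
print(encoded_reviews)
```

Note that a single-letter word such as "a" or "i" could be a valid key, so the character-level version may not fail immediately; it fails as soon as it reaches a character (for example a space) that is not in the vocabulary.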

Or you could modify your `preprocess` function instead:

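from string import punctuation

def preprocess(text):
    text = text.lower()
    # Drop punctuation but keep newlines, which separate the reviews.
    text = "".join([ch for ch in text if ch not in punctuation])
    # Each review becomes a list of words.
    all_reviews = [i.split() for i in text.split("\n")]
    # split() with no arguments also splits on newlines, so no re-join is needed.
    all_words = text.split()

    return all_reviews, all_words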


So your reviews will look like:

[
    ['hey', 'this', 'is', 'review'],
    ['hey', 'this', 'is', 'review2'],
    ...
]
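With reviews already stored as lists of words, the original encoding line should work unchanged. The sketch below assumes a frequency-ranked vocabulary built with `collections.Counter`; that construction is a guess for illustration and may differ from what the repository actually does.

```python
from collections import Counter

# Reviews in the list-of-lists shape shown above (toy data).
all_reviews = [
    ["hey", "this", "is", "review"],
    ["hey", "this", "is", "review2"],
]
all_words = [word for review in all_reviews for word in review]

# Hypothetical vocabulary: most frequent word gets index 1.
counts = Counter(all_words)
vocab_to_int = {
    word: i for i, (word, _) in enumerate(counts.most_common(), start=1)
}

# Each `review` is now a list, so iterating over it yields words.
encoded_reviews = [[vocab_to_int[word] for word in review] for review in all_reviews]
print(encoded_reviews)
```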

superryeti commented Nov 12 '19 07:11

def preprocess(text):
    text = text.lower()
    text = "".join([ch for ch in text if ch not in punctuation])
    all_reviews = [i for i in text.split("\n")]
    text = " ".join(text)
    all_words = text.split()
    
    return all_reviews, all_words

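In the `preprocess` above, `text = " ".join(text)` should be `text = " ".join(all_reviews)`. Calling `join` on a string inserts a space between every character, so `all_words` ends up holding single letters instead of words.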
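A small sketch of the difference, using a made-up two-review input (the sample text is an assumption; the preprocessing steps follow the function quoted above):

```python
from string import punctuation

raw = "Good film.\nBad plot!"  # hypothetical input, one review per line
text = raw.lower()
text = "".join(ch for ch in text if ch not in punctuation)
all_reviews = text.split("\n")

# Joining a *string* treats every character as a separate item.
spaced_out = " ".join(text)
letters = spaced_out.split()

# Joining the *list of reviews* keeps the words intact.
joined = " ".join(all_reviews)
all_words = joined.split()

print(letters[:4])
print(all_words)
```

Here `letters` contains only one-character strings, while `all_words` contains the four words of the two reviews.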

actforjason commented Feb 21 '21 06:02