TextBlob
Memory Error on training a NaiveBayesClassifier
I tried to train the NaiveBayesClassifier on a set of ~39,000 training examples, each a string with its label. An example string: "The room was kind of clean but had a VERY strong smell of dogs. Generally below average but ok for a overnight stay if you're not too fussy. Would consider staying again if the price was right. Breakfast was free and just about better than nothing. --" Training fails with a memory error. Is there anything I can do to resolve it?
Now to the error. I ran the following command, which raised the error:
cl = NaiveBayesClassifier(train)
MemoryError Traceback (most recent call last)
<ipython-input-4-fd1d192cc4f2> in <module>()
1 tick = time.time()
----> 2 cl = NaiveBayesClassifier(train)
3 print time.time()-tick
C:\Users\Carpe\Anaconda3\envs\zam\lib\site-packages\textblob\classifiers.pyc in __init__(self, train_set, feature_extractor, format, **kwargs)
204 feature_extractor=basic_extractor, format=None, **kwargs):
205 super(NLTKClassifier, self).__init__(train_set, feature_extractor, format, **kwargs)
--> 206 self.train_features = [(self.extract_features(d), c) for d, c in self.train_set]
207
208 def __repr__(self):
C:\Users\Carpe\Anaconda3\envs\zam\lib\site-packages\textblob\classifiers.pyc in extract_features(self, text)
181 # Feature extractor may take one or two arguments
182 try:
--> 183 return self.feature_extractor(text, self._word_set)
184 except (TypeError, AttributeError):
185 return self.feature_extractor(text)
C:\Users\Carpe\Anaconda3\envs\zam\lib\site-packages\textblob\classifiers.pyc in basic_extractor(document, train_set)
95 tokens = _get_document_tokens(document)
96 features = dict(((u'contains({0})'.format(word), (word in tokens))
---> 97 for word in word_features))
98 return features
99
MemoryError:
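Going by the traceback, basic_extractor appears to build, for every training document, a dict with one entry per word in word_features (the vocabulary of the whole training set), so memory grows roughly as documents times vocabulary size. One possible workaround is a feature extractor that only emits features for the words actually present in a document. The sketch below is an assumption-laden illustration, not TextBlob's own code: present_words_extractor is a hypothetical name, and it uses a simple regex tokenizer rather than TextBlob's tokenizer.

```python
import re


def present_words_extractor(document):
    """Hypothetical extractor: one feature per distinct word in this document.

    Unlike the default extractor shown in the traceback, it does not add a
    False entry for every vocabulary word the document lacks.
    """
    # Assumption: lowercase regex tokenization is good enough for a sketch.
    tokens = set(re.findall(r"[a-z0-9']+", document.lower()))
    return {"contains({0})".format(word): True for word in tokens}


features = present_words_extractor("The room was clean but the room was small")
print(len(features))  # number of distinct words in this one document
```

With TextBlob installed, a one-argument extractor like this could be passed as NaiveBayesClassifier(train, feature_extractor=present_words_extractor); the feature_extractor parameter and the one-argument fallback both show up in the traceback above. This changes the feature set, so classification results may differ from the default.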
Same error!
I encountered this error as well. Is there a way to determine whether my computer's resources are the issue, or whether some optimization needs to happen?
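One rough way to judge this is to estimate how many feature entries training would create. Assuming the default extractor stores one dict entry per (document, vocabulary word) pair, as the traceback suggests, the total is about the number of documents times the vocabulary size. A minimal sketch follows; estimate_feature_entries is a hypothetical helper, and the regex tokenization is an assumption, so the count will only approximate what TextBlob's tokenizer produces.

```python
import re


def estimate_feature_entries(train):
    """Return (estimated dict entries, vocabulary size) for (text, label) pairs."""
    vocabulary = set()
    for text, _label in train:
        # Assumption: simple lowercase regex tokenization.
        vocabulary.update(re.findall(r"[a-z0-9']+", text.lower()))
    # One entry per document per vocabulary word.
    return len(train) * len(vocabulary), len(vocabulary)


sample = [
    ("good clean room", "pos"),
    ("bad smell of dogs", "neg"),
    ("clean and good", "pos"),
]
entries, vocab_size = estimate_feature_entries(sample)
print(entries, vocab_size)
```

As a purely illustrative figure, 39,000 documents with a 20,000-word vocabulary would come to 780 million entries, which would point to the feature representation rather than the machine.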
Same error here.
File "C:/Users/jorge/Documents/pycharm/Clasificadores/DataSetsClasificacion/clasificadores_Main.py", line 38, in Clasificadores
    cl = NaiveBayesClassifier(File_Input_Train, format=FormatTrain)
File "C:\Users\jorge\Documents\pycharm\venv\lib\site-packages\textblob\classifiers.py", line 206, in __init__
    self.train_features = [(self.extract_features(d), c) for d, c in self.train_set]
File "C:\Users\jorge\Documents\pycharm\venv\lib\site-packages\textblob\classifiers.py", line 183, in extract_features
    return self.feature_extractor(text, self._word_set)
File "C:\Users\jorge\Documents\pycharm\venv\lib\site-packages\textblob\classifiers.py", line 97, in basic_extractor
    for word in word_features))
MemoryError
Facing the same issue.
I faced the same issue. Restarting my system solved it in no time. Don't know how it worked but it did.