Memory Error on training a NaiveBayesClassifier

Open lifeofzi opened this issue 7 years ago • 5 comments

I tried to train the NaiveBayesClassifier on a set of ~39,000 training examples, each a string with its label. An example string: "The room was kind of clean but had a VERY strong smell of dogs. Generally below average but ok for a overnight stay if you're not too fussy. Would consider staying again if the price was right. Breakfast was free and just about better than nothing. --" Training fails with a MemoryError. Is there anything I can do to resolve it?

Now to the error. I ran the following command, which raised the error: cl = NaiveBayesClassifier(train)

MemoryError                               Traceback (most recent call last)
<ipython-input-4-fd1d192cc4f2> in <module>()
      1 tick = time.time()
----> 2 cl = NaiveBayesClassifier(train)
      3 print time.time()-tick

C:\Users\Carpe\Anaconda3\envs\zam\lib\site-packages\textblob\classifiers.pyc in __init__(self, train_set, feature_extractor, format, **kwargs)
    204                  feature_extractor=basic_extractor, format=None, **kwargs):
    205         super(NLTKClassifier, self).__init__(train_set, feature_extractor, format, **kwargs)
--> 206         self.train_features = [(self.extract_features(d), c) for d, c in self.train_set]
    207 
    208     def __repr__(self):

C:\Users\Carpe\Anaconda3\envs\zam\lib\site-packages\textblob\classifiers.pyc in extract_features(self, text)
    181         # Feature extractor may take one or two arguments
    182         try:
--> 183             return self.feature_extractor(text, self._word_set)
    184         except (TypeError, AttributeError):
    185             return self.feature_extractor(text)

C:\Users\Carpe\Anaconda3\envs\zam\lib\site-packages\textblob\classifiers.pyc in basic_extractor(document, train_set)
     95     tokens = _get_document_tokens(document)
     96     features = dict(((u'contains({0})'.format(word), (word in tokens))
---> 97                                             for word in word_features))
     98     return features
     99 

MemoryError:
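
Reading the traceback above: it ends in basic_extractor, which builds one `contains(word)` entry for every word in `word_features` for each training document, so the feature list appears to grow roughly as documents × vocabulary. A minimal sketch of one possible workaround, assuming a custom extractor is acceptable for the task: emit features only for the words actually present in a document. The name `sparse_extractor` is hypothetical, and the regex tokenization is a simplification rather than TextBlob's own tokenizer, so classification results may differ from the default extractor.

```python
import re


def sparse_extractor(document):
    """Return contains(word) features only for words present in the document.

    Hypothetical helper, not part of TextBlob. Absent words are simply
    omitted instead of being stored as False for every document.
    """
    # Simplified tokenization (an assumption): lowercase alphanumeric runs.
    tokens = set(re.findall(r"[a-z0-9']+", document.lower()))
    return {"contains({0})".format(word): True for word in tokens}


features = sparse_extractor("The room was kind of clean")

# Possible usage with the classifier from the traceback (requires textblob;
# `train` is assumed to be the list of (text, label) pairs from the report):
# from textblob.classifiers import NaiveBayesClassifier
# cl = NaiveBayesClassifier(train, feature_extractor=sparse_extractor)
```

Because the extractor takes a single argument, it relies on the fallback branch visible in extract_features, which retries with `self.feature_extractor(text)` when the two-argument call raises a TypeError.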

lifeofzi avatar Oct 11 '17 19:10 lifeofzi

Same error!!!

cuong369 avatar Jun 20 '18 17:06 cuong369

I encountered this error as well. Is there a way to determine whether my computer's resources are the issue, or is there some optimization that needs to happen?
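
One rough way to gauge whether memory is the limit is to count how many feature entries the default extractor would have to hold. This is a back-of-the-envelope sketch, assuming the training data is a list of (text, label) pairs and that every document gets one entry per vocabulary word, as the `for word in word_features` line in the traceback suggests. The function name `estimate_feature_entries` is hypothetical, and whitespace splitting only approximates the real tokenizer.

```python
def estimate_feature_entries(train):
    """Return (documents, vocabulary size, total dict entries) for a dataset.

    Hypothetical helper for estimation only; it does not call TextBlob.
    """
    vocabulary = set()
    for text, _label in train:
        # Whitespace split is an approximation of the library's tokenization.
        vocabulary.update(text.lower().split())
    n_docs = len(train)
    return n_docs, len(vocabulary), n_docs * len(vocabulary)


# Toy data for illustration; substitute the real training list.
toy_train = [
    ("the room was clean", "pos"),
    ("the room had a strong smell", "neg"),
    ("breakfast was free", "pos"),
]
n_docs, vocab_size, total_entries = estimate_feature_entries(toy_train)
```

If the total for the real dataset runs into the hundreds of millions or more, the dictionaries alone are likely to exceed typical RAM, since each entry holds a freshly built key string plus a value.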

cmazzoni87 avatar Jul 09 '18 03:07 cmazzoni87

Same error here.

  File "C:/Users/jorge/Documents/pycharm/Clasificadores/DataSetsClasificacion/clasificadores_Main.py", line 38, in Clasificadores
    cl = NaiveBayesClassifier(File_Input_Train, format=FormatTrain)
  File "C:\Users\jorge\Documents\pycharm\venv\lib\site-packages\textblob\classifiers.py", line 206, in __init__
    self.train_features = [(self.extract_features(d), c) for d, c in self.train_set]
  File "C:\Users\jorge\Documents\pycharm\venv\lib\site-packages\textblob\classifiers.py", line 183, in extract_features
    return self.feature_extractor(text, self._word_set)
  File "C:\Users\jorge\Documents\pycharm\venv\lib\site-packages\textblob\classifiers.py", line 97, in basic_extractor
    for word in word_features))
MemoryError

Jor-G-ete avatar Jul 11 '18 12:07 Jor-G-ete

Facing the same issue.

arnabchoudhury1 avatar Mar 26 '19 07:03 arnabchoudhury1

I faced the same issue. Restarting my system solved it in no time. Don't know how it worked but it did.

gupta-sarthak avatar Jan 21 '20 04:01 gupta-sarthak