Problem with MNB_classifier from NLTK learning tutorial

by: samcool2010, 7 years ago


Hey, I followed through your NLTK tutorial. I am trying to run the classifiers for the first time, and I keep getting a memory error. I copied the code from the webpage, but I can't figure out what is causing the error; it seems to come from within the sklearn package.

Traceback (most recent call last):
  File "C:\Users\samue\Documents\DataMining\sentiment.py", line 109, in <module>
    MNB_classifier.train(training_set)
  File "C:\Users\samue\AppData\Local\Programs\Python\Python36-32\lib\site-packages\nltk\classify\scikitlearn.py", line 117, in train
    X = self._vectorizer.fit_transform(X)
  File "C:\Users\samue\AppData\Local\Programs\Python\Python36-32\lib\site-packages\sklearn\feature_extraction\dict_vectorizer.py", line 230, in fit_transform
    return self._transform(X, fitting=True)
  File "C:\Users\samue\AppData\Local\Programs\Python\Python36-32\lib\site-packages\sklearn\feature_extraction\dict_vectorizer.py", line 172, in _transform
    values.append(dtype(v))
MemoryError


Any idea what I need to do to make this work?
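
For context, here is roughly the part of my script that fails. It is reconstructed from the tutorial, so the names (featuresets, training_set, MNB_classifier) follow the tutorial rather than my exact file, and the tiny stand-in feature sets are only there so the snippet runs on its own:

import nltk
from nltk.classify.scikitlearn import SklearnClassifier
from sklearn.naive_bayes import MultinomialNB

# Stand-in (feature_dict, label) pairs; in the tutorial these come from
# find_features() applied to the movie_reviews documents
featuresets = [({"great": True, "awful": False}, "pos"),
               ({"great": False, "awful": True}, "neg")] * 50

training_set = featuresets[:80]
testing_set = featuresets[80:]

# SklearnClassifier wraps scikit-learn's MultinomialNB so it accepts
# NLTK-style feature dicts; its internal DictVectorizer.fit_transform
# is the call raising MemoryError in the traceback above
MNB_classifier = SklearnClassifier(MultinomialNB())
MNB_classifier.train(training_set)
print("MNB accuracy:", nltk.classify.accuracy(MNB_classifier, testing_set))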






I found the solution. It seems my computer runs out of memory when word_features is set too high. The tutorial example uses 5000, but the most my computer could handle was about 1300. I still get roughly the same accuracy with 1300, so I am satisfied with that.

word_features = list(all_words.keys())[:1300]
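
For anyone hitting the same wall, here is roughly where that line sits in the tutorial code. This is a sketch assuming the movie_reviews setup from earlier in the tutorial, and the 1300 cap is just what fit in my RAM:

import random
import nltk
from nltk.corpus import movie_reviews   # may need nltk.download('movie_reviews') once

documents = [(list(movie_reviews.words(fileid)), category)
             for category in movie_reviews.categories()
             for fileid in movie_reviews.fileids(category)]
random.shuffle(documents)

all_words = nltk.FreqDist(w.lower() for w in movie_reviews.words())

# Every document's feature dict gets one True/False entry per word feature,
# so memory use grows with this cap; 5000 overflowed my RAM, 1300 fit
word_features = list(all_words.keys())[:1300]

def find_features(document):
    words = set(document)
    return {w: (w in words) for w in word_features}

featuresets = [(find_features(rev), category) for (rev, category) in documents]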


-samcool2010 7 years ago
