```python
# j is adjective, r is adverb, and v is verb
# allowed_word_types = ["J", "R", "V"]
allowed_word_types = ["J"]

for p in short_pos.split('\n'):
    documents.append((p, "pos"))
    words = word_tokenize(p)
    pos = nltk.pos_tag(words)
    for w in pos:
        if w[1][0] in allowed_word_types:
            all_words.append(w[0].lower())

for p in short_neg.split('\n'):
    documents.append((p, "neg"))
    words = word_tokenize(p)
    pos = nltk.pos_tag(words)
    for w in pos:
        if w[1][0] in allowed_word_types:
            all_words.append(w[0].lower())
```
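To illustrate the tag-filtering step above, here is a minimal sketch that uses a hardcoded list of (word, tag) pairs standing in for `nltk.pos_tag` output (the sample sentence and its tags are assumptions for illustration). It keeps only words whose Penn Treebank tag starts with `"J"`, i.e. adjectives:

```python
# Hypothetical tagged output, in the same (word, tag) shape nltk.pos_tag returns.
pos = [("This", "DT"), ("movie", "NN"), ("was", "VBD"),
       ("truly", "RB"), ("wonderful", "JJ"), ("and", "CC"), ("moving", "JJ")]

allowed_word_types = ["J"]  # keep only adjectives (tags starting with "J")

# Lowercase each kept word, mirroring the all_words.append(w[0].lower()) step.
all_words = [w.lower() for w, tag in pos if tag[0] in allowed_word_types]
print(all_words)  # ['wonderful', 'moving']
```

Restricting to one tag class like this shrinks the vocabulary the vectorizer has to hold, which also reduces memory pressure later in training.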
I'm getting the following error at `MNB_classifier.train`:
```
Traceback (most recent call last):
  File "D:\Users\siddharth.u\Documents\Natural Language Processing\Creating a module for Sentiment Analysis with NLTK\Creating a module for Sentiment Analysis with NLTK.py", line 103, in <module>
    MNB_classifier.train(training_set)
  File "D:\Users\siddharth.u\AppData\Local\Continuum\Anaconda3\lib\site-packages\nltk\classify\scikitlearn.py", line 117, in train
    X = self._vectorizer.fit_transform(X)
  File "D:\Users\siddharth.u\AppData\Local\Continuum\Anaconda3\lib\site-packages\sklearn\feature_extraction\dict_vectorizer.py", line 230, in fit_transform
    return self._transform(X, fitting=True)
  File "D:\Users\siddharth.u\AppData\Local\Continuum\Anaconda3\lib\site-packages\sklearn\feature_extraction\dict_vectorizer.py", line 171, in _transform
    indices.append(vocab[f])
MemoryError
```
The MemoryError indicates that you are either running out of RAM, or you are using the 32-bit version of Python and are hitting that process's 2 GB address-space limit. If 32-bit Python is the limiting factor, you can simply switch to the 64-bit version and should be fine. If you are running out of actual RAM, it's a different story.
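A quick way to tell which build you are running, as a small standard-library sketch (no assumptions beyond a CPython interpreter):

```python
import struct
import sys

# The pointer size in bits distinguishes the builds:
# 32 on a 32-bit interpreter, 64 on a 64-bit one.
bits = 8 * struct.calcsize("P")
print("Running a %d-bit Python" % bits)

# Cross-check: sys.maxsize exceeds 2**32 only on 64-bit builds.
print("64-bit" if sys.maxsize > 2**32 else "32-bit")
```

If this prints 32-bit, installing the 64-bit Anaconda build is the first thing to try before reducing the feature set.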
-Tmesus 7 years ago