Python Programming Tutorials

Sentiment Analysis

I am following the Natural Language Processing series and downloaded positive.txt and negative.txt to build the training and testing sets. I got a MemoryError while running the program, so I deleted part of both text files, and then it ran without errors. However, for sentences like "It was beautiful" or "It was lovely" the output is negative. What should I do?


It's not going to have amazing accuracy. It's a simple example, and you had to get rid of training data, which is not going to help either. You should only be concerned if most classifications are incorrect, or if the accuracy is close to 50% (no better than guessing between two labels). Even 70% accuracy still means you get a lot wrong.

-Harrison 7 years ago
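
To put a number on the point about accuracy, here is a minimal sketch in plain Python. The `accuracy` helper and the ten labels are made up for illustration and are not from the tutorial's data. It only shows that at 70% accuracy roughly 3 out of every 10 sentences are misclassified, so a single wrong answer on "It was lovely" is not surprising by itself.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical labels for ten test sentences, invented for this example.
actual = ["pos", "pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg"]
predicted = ["pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg", "neg", "pos"]

score = accuracy(predicted, actual)
wrong = sum(p != a for p, a in zip(predicted, actual))
print(f"accuracy: {score:.0%}, misclassified: {wrong} of {len(actual)}")
```

With two balanced labels, a score near 50% would mean the classifier is doing no better than a coin flip, which is the case worth worrying about.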


My accuracy was around 65-68% every time.
So I cannot use the entire data set for training because I run out of memory. But what if I split my training data into multiple text files and, in
classifier = nltk.NaiveBayesClassifier.train(training_set)
keep changing the training set? Will that have the same effect as training on the entire data set all at once?

-tupan 7 years ago
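
On the question above: as far as I know, each call to nltk.NaiveBayesClassifier.train builds a new classifier from only the training set passed to it, so reassigning `classifier` in a loop would keep just the last chunk rather than combine them. Naive Bayes itself is built from counts, though, and counts can be accumulated chunk by chunk. The sketch below is a toy, pure-Python illustration of that idea. The `CountingNaiveBayes` class, its `update` method, and the sample sentences are all hypothetical. It is not NLTK's implementation and uses plain word counts with add-one smoothing instead of the tutorial's feature dictionaries.

```python
import math
from collections import Counter, defaultdict


class CountingNaiveBayes:
    """Toy multinomial Naive Bayes that can be updated chunk by chunk."""

    def __init__(self):
        self.label_counts = Counter()            # documents seen per label
        self.word_counts = defaultdict(Counter)  # word counts per label
        self.vocabulary = set()                  # every word seen so far

    def update(self, labeled_docs):
        """Add one chunk of (list_of_words, label) pairs to the running counts."""
        for words, label in labeled_docs:
            self.label_counts[label] += 1
            self.word_counts[label].update(words)
            self.vocabulary.update(words)

    def classify(self, words):
        """Return the label with the highest log-probability (add-one smoothing)."""
        total_docs = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            score = math.log(doc_count / total_docs)
            label_total = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (label_total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Two small chunks standing in for separate text files (made-up data).
chunk_a = [("it was beautiful".split(), "pos"), ("it was awful".split(), "neg")]
chunk_b = [("a lovely film".split(), "pos"), ("a boring film".split(), "neg")]

# Feed the chunks one at a time; only one chunk is in memory per update.
incremental = CountingNaiveBayes()
incremental.update(chunk_a)
incremental.update(chunk_b)

# Feed everything in a single pass for comparison.
all_at_once = CountingNaiveBayes()
all_at_once.update(chunk_a + chunk_b)

same_counts = (incremental.word_counts == all_at_once.word_counts
               and incremental.label_counts == all_at_once.label_counts)
print("same counts:", same_counts)
print(incremental.classify("it was lovely".split()))
```

Because the model is only a set of counts, updating chunk by chunk ends with the same counts as a single pass over all the data. That property belongs to this toy class, not to repeated calls to train, so splitting the files only helps if the counts are carried over between chunks.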
