Python Programming Tutorials

Twitter Sentiment Analysis with NLTK

Now that we have a sentiment analysis module, we can apply it to just about any text, but preferably short bits of text, like tweets from Twitter! To do this, we're going to combine this tutorial with the Twitter streaming API tutorial.

The initial code from that tutorial is:

from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener


#consumer key, consumer secret, access token, access secret.
ckey="fsdfasdfsafsffa"
csecret="asdfsadfsadfsadf"
atoken="asdf-aassdfs"
asecret="asdfsadfsdafsdafs"

class listener(StreamListener):

    def on_data(self, data):
        # Print the raw JSON string for every message received.
        print(data)
        return True

    def on_error(self, status):
        print(status)

auth = OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)

twitterStream = Stream(auth, listener())
twitterStream.filter(track=["car"])

That is enough to print out all of the data for live streaming tweets that contain the term "car." We can use the json module to parse the data variable with all_data = json.loads(data), and then we can reference the text of the tweet specifically with:

tweet = all_data["text"]
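To see what that lookup does without connecting to the stream, here is a minimal sketch using a hypothetical, heavily trimmed payload (real messages carry many more fields). It also assumes that some stream messages, such as deletion notices, have no "text" key, in which case .get() avoids a KeyError:

```python
import json

# Hypothetical, trimmed-down stand-in for the JSON string passed to on_data.
sample_data = '{"text": "I love my new car!", "user": {"screen_name": "example_user"}}'

all_data = json.loads(sample_data)  # JSON string -> Python dict
tweet = all_data["text"]
print(tweet)

# Assumed shape of a message that is not a tweet (no "text" key).
notice = json.loads('{"delete": {"status": {"id": 1}}}')
print(notice.get("text"))  # .get() returns None instead of raising KeyError
```

If you expect such messages, checking for None before calling the sentiment module is one way to keep the listener from stopping on them.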

Now that we have a tweet, we can easily pass this through our sentiment_mod module!

from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import json
import sentiment_mod as s

#consumer key, consumer secret, access token, access secret.
ckey="asdfsafsafsaf"
csecret="asdfasdfsadfsa"
atoken="asdfsadfsafsaf-asdfsaf"
asecret="asdfsadfsadfsadfsadfsad"

class listener(StreamListener):

    def on_data(self, data):
        # Each message from the stream arrives as a JSON string.
        all_data = json.loads(data)

        tweet = all_data["text"]
        sentiment_value, confidence = s.sentiment(tweet)
        print(tweet, sentiment_value, confidence)

        # Only record classifications with at least 80% confidence.
        if confidence*100 >= 80:
            with open("twitter-out.txt", "a") as output:
                output.write(sentiment_value)
                output.write('\n')

        return True

    def on_error(self, status):
        print(status)

auth = OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)

twitterStream = Stream(auth, listener())
twitterStream.filter(track=["happy"])

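This prints each tweet containing the term "happy" along with its sentiment and the classifier's confidence. Along with that, whenever the confidence is at least 80%, we're also saving the sentiment to an output file, twitter-out.txt.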
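Before graphing, it can help to check what ended up in the output file. The following is a minimal sketch, not part of the original tutorial code: tally_sentiment is a hypothetical helper, and the "pos" and "neg" labels are an assumption about what sentiment_mod returns. It writes a small stand-in file to a temporary directory so it runs on its own:

```python
from collections import Counter
import os
import tempfile


def tally_sentiment(path):
    """Count how often each label appears in a file with one label per line."""
    with open(path) as f:
        return Counter(line.strip() for line in f if line.strip())


# Stand-in for twitter-out.txt; the "pos"/"neg" labels are assumed, not confirmed.
sample_path = os.path.join(tempfile.mkdtemp(), "twitter-out.txt")
with open(sample_path, "w") as f:
    f.write("pos\nneg\npos\npos\n")

counts = tally_sentiment(sample_path)
print(counts["pos"], counts["neg"])
```

Pointing tally_sentiment at the real twitter-out.txt gives a quick running total of each label.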

Next, what data analysis would be complete without graphs? Let's combine yet another tutorial with this one to make a live streaming graph from the sentiment analysis on the Twitter API!
