Python Programming Tutorials

Reading from our sentiment database - Sentiment Analysis GUI with Dash and Python p.3

Hello and welcome to part 3 of our sentiment analysis visualization application project with Dash. Leading up to this part, we learned how to calculate senitment on strings, how to stream data from Twitter, and now we're ready to tie it in to Dash. If you are not already familiar with Dash, check out the Dash tutorials.

In our Dash application, we need some way to interact with our database. Because we are fairly likely to do various calcualtions on this data, rather than displaying the raw sentiment, I am going to use pandas. If you do not have pandas: pip install pandas. Pandas is a popular data analysis library for Python. You can think of it like Excel for Python. If you would like to learn more, check out the Data Analysis with Python and Pandas Tutorials.

Let's quickly cover how we can read data from a database to a dataframe, and then a simple calculation we might make:

import sqlite3
import pandas as pd

conn = sqlite3.connect('twitter.db')
c = conn.cursor()

df = pd.read_sql("SELECT * FROM sentiment WHERE tweet LIKE '%olympic%' ORDER BY unix DESC LIMIT 1000", conn)
print(df.tail())

To load a database into a dataframe, we first pass the SQL query, and then the connection object. In this case, our SQL query is hunting for any tweets that contain the substring of "olympic" in them. Since the olympics are going on at the time of my writing this, it's a fairly common subject at the moment being talked about. For now, we're also limiting the return to be 1000, sorting by the unix time, descending (so we get the latest tweets).

Example output:

             unix                                              tweet  \
995  1.519571e+12  RT @NekoDaice: I'm not an exo fan but damn tha...
996  1.519571e+12  RT @Workpoint_Ent: `yaakcchaekhaaaipfang I'm T...
997  1.519571e+12  RT @MomentsAU: EXO and CL didn't disappoint at...
998  1.519571e+12  RT @smolkjd: how much are we gonna bet some EX...
999  1.519571e+12  RT @MartinBelam: I picked ten highlights from ...

     sentiment
995    -0.8972
996     0.6369
997     0.3089
998     0.4939
999     0.5859

There's a quick snippet of our data, looking good. Now, the raw sentiment scores are going to be all over the place and noisy. We're going to want to smooth them. Let's do this with a moving average. We don't want just any moving average though, we want something that is suitable for the size of data we might be finding.

Also, since we've sorted by unix time in descending order, we probably want to flip our data back around to be in proper chronological order, with:

df.sort_values('unix', inplace=True)

Now to smooth things out:

df['sentiment_smoothed'] = df['sentiment'].rolling(int(len(df)/5)).mean()
df.dropna(inplace=True)
print(df.tail())

Full code up to this point:

import sqlite3
import pandas as pd

conn = sqlite3.connect('twitter.db')
c = conn.cursor()

df = pd.read_sql("SELECT * FROM sentiment WHERE tweet LIKE '%olympic%' ORDER BY unix DESC LIMIT 1000", conn)
df.sort_values('unix', inplace=True)

df['sentiment_smoothed'] = df['sentiment'].rolling(int(len(df)/5)).mean()
df.dropna(inplace=True)
print(df.tail())

Output:

           unix                                              tweet  sentiment  \
4  1.519575e+12  RT @MichaelG1: This is what happens when a spo...     0.0000
3  1.519575e+12  RT @sebaeksalot: If I had to choose between tw...     0.4767
2  1.519575e+12  I LOVE YOU ALL SO MUCH  YOU DESERVE EVERYTHING...     0.9374
1  1.519575e+12  RT @morningpassages: And full capture of epic ...     0.0000
0  1.519575e+12  RT @MichaelG1: This is what happens when a spo...     0.0000

   sentiment_smoothed
4            0.173937
3            0.174035
2            0.178722
1            0.178082
0            0.174713

Okay, at this point, we should have everything we need to create our first basic Dash app that will display live Twitter sentiment for any word we want. Let's do that in the next tutorial.

The next tutorial:

Intro - Data Visualization Applications with Dash and Python p.1
Interactive User Interface - Data Visualization GUIs with Dash and Python p.2
Dynamic Graph based on User Input - Data Visualization GUIs with Dash and Python p.3
Live Graphs - Data Visualization GUIs with Dash and Python p.4
Vehicle Data App Example - Data Visualization GUIs with Dash and Python p.5
Out of the Box Sentiment Analysis options with Python using VADER Sentiment and TextBlob
Streaming Tweets and Sentiment from Twitter in Python - Sentiment Analysis GUI with Dash and Python p.2
Reading from our sentiment database - Sentiment Analysis GUI with Dash and Python p.3
Live Twitter Sentiment Graph - Sentiment Analysis GUI with Dash and Python p.4
Dynamically Graphing Terms for Sentiment - Sentiment Analysis GUI with Dash and Python p.5
Deploy Dash App to a VPS web server - Data Visualization Applications with Dash and Python p.11