Python Programming Tutorials

Using Quandl for more data

In this machine learning tutorial, we're going to discuss using Quandl for acquiring better data. Up to this point, we've been taking the current stock's performance and comparing it to its current key statistics. The problem here is that, while we can perform machine learning on this, we cannot actually invest based on our findings.

Instead, we need to pull the key statistics, check what the stock price was at that time, and then check what the price was a year later. This gives us a much better idea of which key statistics lead to outperformance.
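To make the idea concrete, here is a minimal sketch of that "price then versus price a year later" comparison. The prices are made up, and the helper names price_on_or_after and one_year_change are just for illustration, not something from the Quandl package.

```python
from bisect import bisect_left
from datetime import date, timedelta


def price_on_or_after(prices, day):
    """Return the first price dated on or after `day`, or None.

    `prices` is a list of (date, price) tuples sorted by date. Markets are
    closed on some days, so we take the next available trading day.
    """
    dates = [d for d, _ in prices]
    i = bisect_left(dates, day)
    return prices[i][1] if i < len(prices) else None


def one_year_change(prices, snapshot_day):
    """Percent change from the snapshot day to roughly one year later."""
    start = price_on_or_after(prices, snapshot_day)
    end = price_on_or_after(prices, snapshot_day + timedelta(days=365))
    if start is None or end is None:
        return None  # not enough price history to measure a full year
    return (end - start) / start * 100.0


# Made-up prices, purely for illustration.
prices = [
    (date(2013, 1, 2), 50.0),
    (date(2013, 7, 1), 55.0),
    (date(2014, 1, 2), 60.0),
]

change = one_year_change(prices, date(2013, 1, 2))
print(change)
```

If there is no price a year out (for example, the key statistics are too recent), the function returns None so we can skip that sample rather than guess.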

We've covered downloading the csv manually from Quandl, but now we've got a pretty large number of stocks, so we want to do this with our program.

First, you're going to need the quandl package. This isn't totally necessary, as pulling from the API is quite simple with or without the package, but it does make it a bit easier and knocks out a few steps. The Quandl package is here.

In order to install this for Python 3, change the print statements in the setup.py file into print() function calls (they are written in Python 2.7 syntax).

If setup.py doesn't work for you, then just manually move the package right in. So, when you've downloaded Quandl and extracted it, you should have a "Quandl" directory from the download.

Next, move that Quandl directory into C:/Python34/Lib/site-packages/
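If your Python lives somewhere other than C:/Python34, the following sketch prints where your interpreter keeps third-party packages and checks whether a package can be found before you try to import it. The is_importable helper is just a name chosen for this example.

```python
import importlib.util
import sysconfig

# Directory where this interpreter looks for pure-Python third-party packages.
site_packages = sysconfig.get_paths()["purelib"]
print("site-packages:", site_packages)


def is_importable(name):
    """Return True if Python can locate a top-level module or package."""
    return importlib.util.find_spec(name) is not None


# "Quandl" matches the directory name from the download used in this tutorial.
print("Quandl importable:", is_importable("Quandl"))
```

If the second line prints False, the Quandl directory is most likely not in the folder printed on the first line.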

Then try to import Quandl. If you're having trouble, check the video and/or leave a comment and I will try to help.

Now, when you want a data set, you just need its tag. To get that, look at the bar on the right and click on "python." That will give you the tag. In the case of the video, we see that clicking it gives us: Quandl.get("WIKI/AAPL"), so the official tag here is "WIKI/AAPL".
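Since we have a long list of stocks, we probably don't want to look up every tag by hand. Here is a small sketch that builds tags from tickers, assuming every stock we care about follows the same "WIKI/TICKER" pattern as "WIKI/AAPL". That pattern is an assumption, and quandl_tag is a helper name made up for this example.

```python
def quandl_tag(ticker, database="WIKI"):
    """Build a tag such as "WIKI/AAPL" from a ticker symbol.

    Assumes the data set is named after the upper-case ticker.
    """
    return "{}/{}".format(database, ticker.strip().upper())


tickers = ["aapl", "ko", "msft"]
tags = [quandl_tag(t) for t in tickers]
print(tags)
```

Any ticker that doesn't follow the pattern will simply fail when you try to pull it, so it is worth handling errors when you loop over the list.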

We have that, and then we're ready to pull. With Quandl, you can actually pull multiple tickers at once, but the problem is we just want a single column, and we want to rename that column.
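To show what "a single column, renamed" looks like, here is a sketch using a small hand-made DataFrame in place of real Quandl data. The column names ("Open", "Close", "Adj. Close") and the numbers are assumptions for illustration; check the columns of your own pull before relying on them.

```python
import pandas as pd

# Stand-in for the DataFrame a pull would return (values are made up).
data = pd.DataFrame(
    {
        "Open": [40.1, 40.5],
        "Close": [40.4, 40.9],
        "Adj. Close": [38.2, 38.7],
    },
    index=pd.to_datetime(["2014-12-29", "2014-12-30"]),
)

# Keep only one column, then rename it after the ticker so that columns
# from many stocks can later sit side by side without clashing.
ko = data[["Adj. Close"]].rename(columns={"Adj. Close": "KO"})
print(ko)
```

Using double brackets keeps the result as a DataFrame rather than a Series, which makes joining several stocks together easier later on.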

To pull just one stock, for example, you'll do the following:

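import pandas as pd
import os
import time
from Quandl import Quandl

# Replace this with the auth token from your Quandl account.
auth_tok = "yourauthhere"

# Pull Coca-Cola (KO) from the WIKI database, limited to the given date range.
data = Quandl.get("WIKI/KO", trim_start = "2000-12-12", trim_end = "2014-12-30", authtoken=auth_tok)

print(data)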

If you'll notice, we added some extra parameters to this Quandl.get call. First, we've added trim_start and trim_end, so that we get only the slice of the data that we want.
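If it helps to picture what the trim does, here is a rough stand-alone sketch of the same idea on made-up rows. It assumes both end dates are included in the slice, and the trim function here is our own illustration, not part of the Quandl package.

```python
from datetime import date


def trim(rows, trim_start, trim_end):
    """Keep only the (date, price) rows between two YYYY-MM-DD dates.

    Both ends are treated as inclusive in this sketch.
    """
    start = date.fromisoformat(trim_start)
    end = date.fromisoformat(trim_end)
    return [(d, p) for d, p in rows if start <= d <= end]


# Made-up rows, purely for illustration.
rows = [
    (date(2000, 12, 11), 1.0),
    (date(2000, 12, 12), 2.0),
    (date(2014, 12, 30), 3.0),
    (date(2014, 12, 31), 4.0),
]

trimmed = trim(rows, "2000-12-12", "2014-12-30")
print(trimmed)
```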

Your auth token can be found by going into your Quandl account. Without a token you get something like 50 free pulls per IP address, but with a free account you can make far more requests, so I suggest you just make an account with Quandl.
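Even with an account, when pulling a long list of stocks it is worth pausing between requests and not letting one bad ticker stop the whole run. Below is a minimal sketch of that loop. The fetch argument is a placeholder for whatever actually does the pull (for example, a small function wrapping Quandl.get); pull_all and fake_fetch are names made up for this example, and the pause and retry values are arbitrary.

```python
import time


def pull_all(tags, fetch, pause=0.5, retries=2):
    """Call fetch(tag) for each tag, pausing between requests.

    Returns (results, failed): a dict of tag -> data, and a list of the
    tags that still failed after all retries.
    """
    results, failed = {}, []
    for tag in tags:
        for attempt in range(retries + 1):
            try:
                results[tag] = fetch(tag)
                break  # success, move on to the next tag
            except Exception as exc:
                print("Problem pulling {} (attempt {}): {}".format(tag, attempt + 1, exc))
                time.sleep(pause)
        else:
            # The retry loop finished without a successful pull.
            failed.append(tag)
        time.sleep(pause)  # be polite between requests
    return results, failed


def fake_fetch(tag):
    """Stand-in for a real pull so the sketch runs without a network."""
    if tag == "WIKI/BAD":
        raise ValueError("no such data set")
    return "data for " + tag


results, failed = pull_all(["WIKI/KO", "WIKI/BAD"], fake_fetch, pause=0)
print(results, failed)
```

Keeping the failed tags in a list means you can inspect or retry them later instead of losing track of which stocks are missing.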
