Creating Machine Learning Classifier Feature Sets - Python for Finance 15

Algorithmic trading with Python Tutorial




In this Finance with Python tutorial, we continue building our machine learning trading algorithm by creating its feature sets.

To do this, we iterate through the historical prices with a window of 10 prices, shifting the window forward one bar at a time and building a feature set from each window.

We do this with the following block of code:

        while bar < len(price_list)-1:
            try:
                end_price = price_list[bar+1]
                begin_price = price_list[bar]
                
                pricing_list = []
                xx = 0
                for _ in range(context.feature_window):
                    price = price_list[bar-(context.feature_window-xx)]
                    pricing_list.append(price)
                    xx += 1
                    
                features = np.around(np.diff(pricing_list) / pricing_list[:-1] * 100.0, 1)
                
                if end_price > begin_price:
                    label = 1
                else:
                    label = -1

                bar += 1
                print(features)

            except Exception as e:
                bar += 1
                print('feature creation', str(e))

To explain, first we see:


                end_price = price_list[bar+1]
                begin_price = price_list[bar]

Here, we identify begin_price as the price at the bar we're currently on. We're trying to predict whether the next day's price will be higher or lower, so end_price is the price just one index further along, at bar+1.
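
As a small illustration of that indexing, here is a minimal sketch on a made-up price list. The helper name begin_and_end and the prices are assumptions for this example only, not part of the tutorial's algorithm.

```python
# Toy closing prices (made-up numbers, for illustration only).
price_list = [10.0, 10.5, 10.2, 10.8, 11.0]


def begin_and_end(price_list, bar):
    """Return (begin_price, end_price) for the given bar index."""
    begin_price = price_list[bar]      # the price at the bar we are currently on
    end_price = price_list[bar + 1]    # the next bar's price, the one we try to predict
    return begin_price, end_price


# Stopping at len(price_list) - 1 keeps bar + 1 inside the list,
# which is what the `while bar < len(price_list)-1` condition does.
pairs = [begin_and_end(price_list, bar) for bar in range(len(price_list) - 1)]
print(pairs)
```

The last price in the list is only ever used as an end_price, because there is no later bar to compare it against.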


                pricing_list = []
                xx = 0
                for _ in range(context.feature_window):
                    price = price_list[bar-(context.feature_window-xx)]
                    pricing_list.append(price)
                    xx += 1

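Here, we populate our pricing list with the feature_window (10) prices that come just before the current bar, from price_list[bar-10] through price_list[bar-1]. This is the raw material for the feature list that we'll soon be using.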
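
To check which prices that loop picks up, the sketch below compares it with a plain list slice. The function names and the prices are hypothetical, and the comparison assumes bar is at least feature_window, which holds in the tutorial because bar starts at context.feature_window.

```python
feature_window = 10  # same value as context.feature_window in the tutorial
price_list = [float(p) for p in range(100, 120)]  # made-up prices
bar = 12


def window_by_loop(price_list, bar, feature_window):
    """Collect the window the same way the tutorial's loop does."""
    pricing_list = []
    xx = 0
    for _ in range(feature_window):
        pricing_list.append(price_list[bar - (feature_window - xx)])
        xx += 1
    return pricing_list


def window_by_slice(price_list, bar, feature_window):
    """The same window expressed as a slice (requires bar >= feature_window)."""
    return price_list[bar - feature_window:bar]


loop_window = window_by_loop(price_list, bar, feature_window)
slice_window = window_by_slice(price_list, bar, feature_window)
print(loop_window == slice_window)
print(loop_window[0], loop_window[-1])  # first and last price in the window
```

Both versions stop at index bar-1, so the price at the current bar itself is not part of the window.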


                features = np.around(np.diff(pricing_list) / pricing_list[:-1] * 100.0, 1)

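Now, we convert this price list to a feature list by normalizing the data to percent change. np.diff gives the difference between each pair of consecutive prices; we divide that by the earlier price of each pair, multiply by 100, and round to one decimal place. Ten prices therefore produce nine features.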
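
To make the arithmetic concrete, here is a plain-Python restatement of what that NumPy expression computes, run on made-up prices. The helper percent_changes is an assumption for this example; it uses Python's built-in round, which may differ from np.around in rare tie cases.

```python
def percent_changes(prices, decimals=1):
    """Percent change between each pair of consecutive prices."""
    changes = []
    for earlier, later in zip(prices, prices[1:]):
        # (later - earlier) is one element of np.diff(prices);
        # dividing by `earlier` mirrors dividing by prices[:-1].
        changes.append(round((later - earlier) / earlier * 100.0, decimals))
    return changes


pricing_list = [100.0, 102.0, 101.0, 101.0]  # made-up prices
features = percent_changes(pricing_list)
print(features)  # [2.0, -1.0, 0.0]
```

Four prices give three features, just as the ten-price window gives nine. Working in percent change rather than raw prices lets windows taken at very different price levels be compared on the same scale.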


                if end_price > begin_price:
                    label = 1
                else:
                    label = -1

Now, if the next day's price is higher, the label is 1 (buy, positive, good outlook for this feature set). Otherwise, meaning the price dropped or stayed the same, the label is -1 (sell, negative, bad outlook for this feature set).
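
The labeling rule can be sketched as a tiny function. The name make_label and the prices are made up for this example; the comparison itself is the one used in the tutorial code.

```python
def make_label(begin_price, end_price):
    """Return 1 if the price rose, otherwise -1 (a drop or no change)."""
    if end_price > begin_price:
        return 1
    return -1


labels = [
    make_label(10.0, 10.5),  # rise
    make_label(10.0, 9.5),   # drop
    make_label(10.0, 10.0),  # unchanged, falls into the else branch
]
print(labels)  # [1, -1, -1]
```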

Now, as time goes on, we build up these feature sets and their labels, which we can save. For now the loop only prints each feature set; the X and y lists in the full code are there to hold them. We can then use them to train a classifier, and feed the classifier the most recent prices to decide, based on previous data, whether they are leading up to a rise or a fall tomorrow.
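
As a rough sketch of what saving the feature sets and labels could look like, the example below collects them into X and y using plain Python. The function names, the small feature_window of 3, and the prices are all assumptions made to keep the example short; the tutorial's own loop does not store anything yet.

```python
def percent_changes(prices, decimals=1):
    """Percent change between each pair of consecutive prices."""
    return [round((later - earlier) / earlier * 100.0, decimals)
            for earlier, later in zip(prices, prices[1:])]


def build_feature_sets(price_list, feature_window):
    """Return (X, y): one feature set and one label per usable bar."""
    X, y = [], []
    # Start at feature_window so a full window exists before the bar,
    # and stop one short of the end so bar + 1 is always valid.
    for bar in range(feature_window, len(price_list) - 1):
        window = price_list[bar - feature_window:bar]
        X.append(percent_changes(window))
        y.append(1 if price_list[bar + 1] > price_list[bar] else -1)
    return X, y


price_list = [100.0, 101.0, 102.0, 101.0, 103.0, 102.0]  # made-up prices
X, y = build_feature_sets(price_list, feature_window=3)
print(X)  # [[1.0, 1.0], [1.0, -1.0]]
print(y)  # [1, -1]
```

Lists shaped like this, one row of features per label, are the usual input form for a classifier's training step.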

Full code:

from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC, LinearSVC, NuSVC
from sklearn.ensemble import RandomForestClassifier
from sklearn import preprocessing
from collections import Counter
import numpy as np


def initialize(context):

    context.stocks = symbols('XLY',  # XLY Consumer Discretionary SPDR Fund
                           'XLF',  # XLF Financial SPDR Fund
                           'XLK',  # XLK Technology SPDR Fund
                           'XLE',  # XLE Energy SPDR Fund
                           'XLV',  # XLV Health Care SPDR Fund
                           'XLI',  # XLI Industrial SPDR Fund
                           'XLP',  # XLP Consumer Staples SPDR Fund
                           'XLB',  # XLB Materials SPDR Fund
                           'XLU')  # XLU Utilities SPDR Fund
    
    context.historical_bars = 100
    context.feature_window = 10
    

   

def handle_data(context, data):
    prices = history(bar_count = context.historical_bars, frequency='1d', field='price')

    for stock in context.stocks:   
        ma1 = data[stock].mavg(50)
        ma2 = data[stock].mavg(200)
        
        start_bar = context.feature_window
        price_list = prices[stock].tolist()
        
        X = []
        y = []
        
        bar = start_bar
        
        # feature creation
        while bar < len(price_list)-1:
            try:
                end_price = price_list[bar+1]
                begin_price = price_list[bar]
                
                pricing_list = []
                xx = 0
                for _ in range(context.feature_window):
                    price = price_list[bar-(context.feature_window-xx)]
                    pricing_list.append(price)
                    xx += 1
                    
                features = np.around(np.diff(pricing_list) / pricing_list[:-1] * 100.0, 1)
                
                if end_price > begin_price:
                    label = 1
                else:
                    label = -1

                bar += 1
                print(features)

            except Exception as e:
                bar += 1
                print('feature creation', str(e))

The next tutorial: Creating our Machine Learning Classifiers - Python for Finance 16