Python Programming Tutorials

Deep Learning with TensorFlow - How the Network will run

Welcome to part four of Deep Learning with Neural Networks and TensorFlow, and part 46 of the Machine Learning tutorial series. In this tutorial, we're going to write the code for what happens during the Session in TensorFlow.

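The code here has been updated to support TensorFlow 1.0, but the video shows two lines that need small updates: the softmax_cross_entropy_with_logits call now takes its arguments by keyword (logits= and labels=), and tf.initialize_all_variables() becomes tf.global_variables_initializer(). Both changes are marked with OLD and NEW comments in the full code further down.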

In the previous tutorial, we built the model for our Artificial Neural Network and set up the computation graph with TensorFlow. Now we need to actually set up the training process, which is what will be run in the TensorFlow Session. Continuing along in our code:

def train_neural_network(x):
    prediction = neural_network_model(x)
    cost = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=y) )
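To make the cost line above more concrete, here is a minimal plain-Python sketch of what a softmax cross-entropy averaged over a batch works out to. It is not TensorFlow's implementation, and the helper names (softmax, cross_entropy, mean_cost) and the 3-class example values are made up for illustration.

```python
import math

def softmax(logits):
    """Turn raw output scores into probabilities that sum to 1."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - largest) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, one_hot_label):
    """Cross-entropy between softmax(logits) and a one-hot label."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs))

def mean_cost(batch_logits, batch_labels):
    """Average the per-example losses, as reduce_mean does for the batch."""
    losses = [cross_entropy(l, t) for l, t in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)

# Hypothetical 3-class example: the true class is index 0.
label = [1.0, 0.0, 0.0]
confident_right = [5.0, 0.0, 0.0]  # largest score on the true class
confident_wrong = [0.0, 5.0, 0.0]  # largest score on a wrong class

print('loss when right:', cross_entropy(confident_right, label))
print('loss when wrong:', cross_entropy(confident_wrong, label))
print('batch cost:', mean_cost([confident_right, confident_wrong], [label, label]))
```

The loss is small when the largest output lands on the true class and large when it lands elsewhere, which is why minimizing it pushes the network toward correct predictions.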

We start a new function, train_neural_network, which takes our input data. We then produce a prediction by passing that data through our neural_network_model. Next, we create a cost variable. This measures how wrong we are, and is the value we want to minimize by adjusting our weights. The terms "cost function" and "loss function" mean the same thing here. To minimize our cost, we will use the AdamOptimizer, which is a popular optimizer along with others like Stochastic Gradient Descent and AdaGrad.

    optimizer = tf.train.AdamOptimizer().minimize(cost)
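For a rough feel of what an optimizer like this does on each step, here is a small plain-Python sketch of the textbook Adam update rule applied to a toy one-variable cost. This is an illustration under assumptions, not TensorFlow's internal code: the function names and the toy cost are hypothetical, the beta1, beta2 and epsilon values are the commonly cited defaults, and the example uses a larger learning_rate of 0.01 only so that the toy problem converges in a few thousand steps.

```python
import math

def adam_minimize(grad_fn, w, steps, learning_rate=0.001,
                  beta1=0.9, beta2=0.999, epsilon=1e-8):
    """Run `steps` Adam updates on a single weight w and return the result."""
    m = 0.0  # running average of the gradient
    v = 0.0  # running average of the squared gradient
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction for the early steps
        v_hat = v / (1 - beta2 ** t)
        w -= learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return w

# Hypothetical toy cost with its minimum at w = 3.
def toy_cost(w):
    return (w - 3.0) ** 2

def toy_grad(w):
    return 2.0 * (w - 3.0)

w_start = 0.0
w_end = adam_minimize(toy_grad, w_start, steps=2000, learning_rate=0.01)
print('cost before:', toy_cost(w_start), 'cost after:', toy_cost(w_end))
```

In the real network the same kind of update is applied to every weight and bias at once, using the gradient of the cost with respect to each of them.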

Within AdamOptimizer(), you can optionally specify the learning_rate as a parameter. The default is 0.001, which is fine for most circumstances. Now that we have these things defined, we're going to begin the session.

    hm_epochs = 10
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())

First, the hm_epochs variable determines how many epochs to run, where each epoch is one full cycle of feed forward and backprop over the training data. Next, we use the with syntax to open and close our session, as discussed in the previous tutorial. To begin, we initialize all of our variables. Now come the main steps:

        for epoch in range(hm_epochs):
            epoch_loss = 0
            for _ in range(int(mnist.train.num_examples/batch_size)):
                epoch_x, epoch_y = mnist.train.next_batch(batch_size)
                _, c = sess.run([optimizer, cost], feed_dict={x: epoch_x, y: epoch_y})
                epoch_loss += c

            print('Epoch', epoch + 1, 'completed out of',hm_epochs,'loss:',epoch_loss)

For each epoch, and for each batch in our data, we run our optimizer and cost against that batch. To keep track of our loss/cost along the way, we add each batch's cost to a running total for the epoch. After each epoch, we output that total, which should decline from one epoch to the next. This is useful to track, since it shows the diminishing returns over time. The first few epochs should bring massive improvements, but after about 10 or 20 you will see very small changes, if any, and the results may actually get worse.
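The bookkeeping in that loop can be sketched without TensorFlow. The snippet below is only an illustration: the 55000 figure assumes the size of the MNIST training split as loaded by this helper, and the per-batch cost values are made up.

```python
def batches_per_epoch(num_examples, batch_size):
    """Number of full batches in one epoch; any leftover examples are skipped."""
    return num_examples // batch_size

def epoch_totals(batch_costs_by_epoch):
    """Sum the per-batch costs within each epoch, like epoch_loss += c."""
    totals = []
    for batch_costs in batch_costs_by_epoch:
        epoch_loss = 0.0
        for c in batch_costs:
            epoch_loss += c
        totals.append(epoch_loss)
    return totals

# Assumed training-set size with the tutorial's batch_size of 100.
print(batches_per_epoch(55000, 100))  # 550

# Hypothetical costs for two epochs of three batches each.
fake_costs = [[8.0, 6.0, 4.0], [3.0, 2.0, 1.0]]
print(epoch_totals(fake_costs))  # [18.0, 6.0]
```

Note that epoch_loss is a sum rather than an average, so its size depends on how many batches there are. It is useful for comparing one epoch to the next, not as an absolute measure.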

Now, outside of the epoch for loop:

        correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))

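This tells us, for each example, whether the predicted class (the index of the largest output) matches the class in the one-hot label. The result is a tensor of True/False values, one per prediction, rather than a single count.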
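As a small plain-Python sketch of that comparison, using made-up outputs and one-hot labels for three examples (the helper names are hypothetical and only mirror the idea behind tf.argmax and tf.equal):

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])

def correct_mask(predictions, labels):
    """True where the predicted class index equals the labeled class index."""
    return [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]

# Hypothetical network outputs for three examples and three classes.
predictions = [[0.1, 2.5, -1.0],
               [3.0, 0.2, 0.1],
               [0.0, 0.4, 0.3]]
# One-hot labels: the true classes are 1, 2 and 1.
labels = [[0, 1, 0],
          [0, 0, 1],
          [0, 1, 0]]

print(correct_mask(predictions, labels))  # [True, False, True]
```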

        accuracy = tf.reduce_mean(tf.cast(correct, 'float'))
        print('Accuracy:',accuracy.eval({x:mnist.test.images, y:mnist.test.labels}))
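The accuracy line works because casting True/False to floats gives 1.0 and 0.0, and the mean of those is the fraction of correct predictions. A minimal sketch in plain Python, with a made-up list of results and a hypothetical helper name:

```python
def accuracy_from_mask(mask):
    """Cast booleans to floats and average them."""
    as_floats = [1.0 if ok else 0.0 for ok in mask]
    return sum(as_floats) / len(as_floats)

# Hypothetical results for four test examples: three right, one wrong.
mask = [True, False, True, True]
print(accuracy_from_mask(mask))  # 0.75
```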

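That gives us our final accuracy on the testing set. Because we are still inside the with block, accuracy.eval(...) runs in that session, the same as calling sess.run(accuracy, ...) with the same feed. All that is left is to call the function: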

train_neural_network(x)

Somewhere between 10 and 20 epochs should give you ~95% accuracy. 95% accuracy sounds great, but it is actually considered very bad compared to more popular methods. I actually think 95% accuracy with this model is nothing short of amazing. Consider that the only information we gave our network was pixel values, and that's it. We did not tell it to look for patterns, or how to tell a 4 from a 9, or a 1 from an 8. The network simply figured it out with an inner model, based purely on pixel values to start, and achieved 95% accuracy. That's amazing to me, though state of the art is over 99%.

Full code up to this point:

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot = True)

n_nodes_hl1 = 500
n_nodes_hl2 = 500
n_nodes_hl3 = 500

n_classes = 10
batch_size = 100

x = tf.placeholder('float', [None, 784])
y = tf.placeholder('float')

def neural_network_model(data):
    hidden_1_layer = {'weights':tf.Variable(tf.random_normal([784, n_nodes_hl1])),
                      'biases':tf.Variable(tf.random_normal([n_nodes_hl1]))}

    hidden_2_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl1, n_nodes_hl2])),
                      'biases':tf.Variable(tf.random_normal([n_nodes_hl2]))}

    hidden_3_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl2, n_nodes_hl3])),
                      'biases':tf.Variable(tf.random_normal([n_nodes_hl3]))}

    output_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl3, n_classes])),
                    'biases':tf.Variable(tf.random_normal([n_classes])),}


    l1 = tf.add(tf.matmul(data,hidden_1_layer['weights']), hidden_1_layer['biases'])
    l1 = tf.nn.relu(l1)

    l2 = tf.add(tf.matmul(l1,hidden_2_layer['weights']), hidden_2_layer['biases'])
    l2 = tf.nn.relu(l2)

    l3 = tf.add(tf.matmul(l2,hidden_3_layer['weights']), hidden_3_layer['biases'])
    l3 = tf.nn.relu(l3)

    output = tf.matmul(l3,output_layer['weights']) + output_layer['biases']

    return output

def train_neural_network(x):
    prediction = neural_network_model(x)
    # OLD VERSION:
    #cost = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits(prediction,y) )
    # NEW:
    cost = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=y) )
    optimizer = tf.train.AdamOptimizer().minimize(cost)
    
    hm_epochs = 10
    with tf.Session() as sess:
        # OLD:
        #sess.run(tf.initialize_all_variables())
        # NEW:
        sess.run(tf.global_variables_initializer())

        for epoch in range(hm_epochs):
            epoch_loss = 0
            for _ in range(int(mnist.train.num_examples/batch_size)):
                epoch_x, epoch_y = mnist.train.next_batch(batch_size)
                _, c = sess.run([optimizer, cost], feed_dict={x: epoch_x, y: epoch_y})
                epoch_loss += c

            print('Epoch', epoch + 1, 'completed out of',hm_epochs,'loss:',epoch_loss)

        correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))

        accuracy = tf.reduce_mean(tf.cast(correct, 'float'))
        print('Accuracy:',accuracy.eval({x:mnist.test.images, y:mnist.test.labels}))

train_neural_network(x)

In the next tutorial, we're going to attempt to take this exact model, and apply it to a new dataset that isn't so nicely prepared for us as this one was.

