Welcome to Part 12 of the Python Plays: Grand Theft Auto V tutorial series, where we're working on creating a self-driving car in the game.

In the previous tutorial, we trained a convolutional neural network on some game data, and now we're ready to see how we've done. While we trained the network, we saved our progress to a model file. This lets us easily load the model back in and either use it or train it some more. To load a TensorFlow model file, we need to have the model defined first. The TensorFlow model file is just a file that contains the network connection weights, nothing more. For this reason, the model that we're going to load those weights into needs to be defined already.

We've already separated the model definition out into its own file, alexnet.py, so you should already have it. Just in case:

# alexnet.py

import tflearn
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.estimator import regression
from tflearn.layers.normalization import local_response_normalization

def alexnet(width, height, lr):
    network = input_data(shape=[None, width, height, 1], name='input')
    network = conv_2d(network, 96, 11, strides=4, activation='relu')
    network = max_pool_2d(network, 3, strides=2)
    network = local_response_normalization(network)
    network = conv_2d(network, 256, 5, activation='relu')
    network = max_pool_2d(network, 3, strides=2)
    network = local_response_normalization(network)
    network = conv_2d(network, 384, 3, activation='relu')
    network = conv_2d(network, 384, 3, activation='relu')
    network = conv_2d(network, 256, 3, activation='relu')
    network = max_pool_2d(network, 3, strides=2)
    network = local_response_normalization(network)
    network = fully_connected(network, 4096, activation='tanh')
    network = dropout(network, 0.5)
    network = fully_connected(network, 4096, activation='tanh')
    network = dropout(network, 0.5)
    network = fully_connected(network, 3, activation='softmax')
    network = regression(network, optimizer='momentum',
                         loss='categorical_crossentropy',
                         learning_rate=lr, name='targets')

    model = tflearn.DNN(network, checkpoint_path='model_alexnet',
                        max_checkpoints=1, tensorboard_verbose=2, tensorboard_dir='log')

    return model

Next, let's start a new file called test_model.py:

# test_model.py

import numpy as np
from PIL import ImageGrab
import cv2
import time
from directkeys import PressKey,ReleaseKey, W, A, S, D
from alexnet import alexnet

WIDTH = 80
HEIGHT = 60
LR = 1e-3
EPOCHS = 8
MODEL_NAME = 'pygta5-car-{}-{}-{}-epochs.model'.format(LR, 'alexnetv2',EPOCHS)
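
Since the weights are loaded by file name, it can help to see exactly what that format string evaluates to. Here is a quick check using the same constants as above. The result is only useful if it matches the name the training script saved under, which is assumed here rather than verified:

```python
LR = 1e-3
EPOCHS = 8

# Same format string as in test_model.py.
MODEL_NAME = 'pygta5-car-{}-{}-{}-epochs.model'.format(LR, 'alexnetv2', EPOCHS)

print(MODEL_NAME)  # pygta5-car-0.001-alexnetv2-8-epochs.model
```

If you trained with a different learning rate or epoch count, the name changes with it, so the constants here need to match the ones used in training.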

Next, once rounded, our model's output is going to be [1,0,0], [0,1,0], or [0,0,1], which correspond to left, straight, and right. We want these outputs mapped to some actual actions, so here they are:

def straight():
    PressKey(W)
    ReleaseKey(A)
    ReleaseKey(D)

def left():
    PressKey(W)
    PressKey(A)
    ReleaseKey(D)

def right():
    PressKey(W)
    PressKey(D)
    ReleaseKey(A)
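
One way to keep the output-to-action mapping in a single place is a small lookup table indexed by the position of the 1 in the output. This is only a sketch: PressKey and ReleaseKey are stubbed out to record calls so that it runs without the game or directkeys, and the names ACTIONS, act, and pressed are made up for illustration.

```python
# Stand-ins for directkeys (assumption: the real functions take a key code).
W, A, S, D = 'W', 'A', 'S', 'D'
pressed = set()  # keys currently held down, tracked only for this demo

def PressKey(key):
    pressed.add(key)

def ReleaseKey(key):
    pressed.discard(key)

def straight():
    PressKey(W)
    ReleaseKey(A)
    ReleaseKey(D)

def left():
    PressKey(W)
    PressKey(A)
    ReleaseKey(D)

def right():
    PressKey(W)
    PressKey(D)
    ReleaseKey(A)

# Index 0 = left, 1 = straight, 2 = right, matching the output order.
ACTIONS = [left, straight, right]

def act(moves):
    """Run the action for a one-hot output; do nothing for anything else."""
    if sorted(moves) == [0, 0, 1]:
        ACTIONS[moves.index(1)]()

act([1, 0, 0])
print(sorted(pressed))  # ['A', 'W']
act([0, 0, 1])
print(sorted(pressed))  # ['D', 'W']
```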

These mappings are just a start. Notice that we're actually always pressing "W" and giving gas. We may need to tweak this in time, but that's what we have for now. Next, let's define the model, and load in our weights from training:

model = alexnet(WIDTH, HEIGHT, LR)
model.load(MODEL_NAME)

Now we begin the loop, which is very similar to our previous self-driving car loop, with just some small tweaks, and with the decision coming from our model via model.predict instead:

def main():
    last_time = time.time()
    
    for i in list(range(4))[::-1]:
        print(i+1)
        time.sleep(1)
        
    while(True):
        # 800x600 windowed mode
        screen =  np.array(ImageGrab.grab(bbox=(0,40,800,640)))
        print('loop took {} seconds'.format(time.time()-last_time))
        last_time = time.time()
        screen = cv2.cvtColor(screen, cv2.COLOR_BGR2GRAY)
        screen = cv2.resize(screen, (80,60))
        cv2.imshow('',screen)
        moves = list(np.around(model.predict([screen.reshape(80,60,1)])[0]))
        print(moves)

        if moves == [1,0,0]:
            left()
        elif moves == [0,1,0]:
            straight()
        elif moves == [0,0,1]:
            right()

        if cv2.waitKey(25) & 0xFF == ord('q'):
            cv2.destroyAllWindows()
            break

main()
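
Because the loop only acts when the rounded output is exactly one-hot, a prediction where no class reaches 0.5 rounds to all zeros and no branch runs, so whatever keys were last pressed stay held. One possible alternative, sketched here in plain Python with a made-up prediction, is to act on the most probable class instead. pick_move and rounded are hypothetical helpers; np.argmax would do the same job on the real model output.

```python
def rounded(prediction):
    # Plain-Python stand-in for the np.around call in the loop.
    return [round(p) for p in prediction]

def pick_move(prediction):
    """Return the index of the most probable class (0=left, 1=straight, 2=right)."""
    return max(range(len(prediction)), key=lambda i: prediction[i])

uncertain = [0.40, 0.35, 0.25]  # made-up softmax output, no class above 0.5

print(rounded(uncertain))    # [0, 0, 0]
print(pick_move(uncertain))  # 0
```

Whether always committing to the top class drives better than occasionally doing nothing is something to test in the game rather than assume.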

I have personally found that, unlike the previous self-driving car script, this one probably should be started at full speed. You should run this, immediately go to the game, and get going as fast as possible before the script takes over.

At speed, the model actually does alright, which is impressive considering our training data is only something like 20,000 frames (after balancing).

One thing that you might find annoying is starting and stopping the AI for testing. It won't take long to learn that this thing needs work, but I want to be able to stop and start the self-driving aspect more quickly. Thus, let's make some slight changes to the test_model.py file, starting with a new import:

from getkeys import key_check

Now we can use this to detect a specific key that we might want to use to pause things. Next, let's modify the main function a bit:

def main():
    last_time = time.time()
    for i in list(range(4))[::-1]:
        print(i+1)
        time.sleep(1)

    paused = False
    while(True):
        
        if not paused:
            # 800x600 windowed mode
            screen =  np.array(ImageGrab.grab(bbox=(0,40,800,640)))
            print('loop took {} seconds'.format(time.time()-last_time))
            last_time = time.time()
            screen = cv2.cvtColor(screen, cv2.COLOR_BGR2GRAY)
            screen = cv2.resize(screen, (80,60))
            moves = list(np.around(model.predict([screen.reshape(80,60,1)])[0]))
            if moves == [1,0,0]:
                left()
            elif moves == [0,1,0]:
                straight()
            elif moves == [0,0,1]:
                right()
   
        keys = key_check()

        # p pauses game and can get annoying.
        if 'T' in keys:
            if paused:
                paused = False
                time.sleep(1)
            else:
                paused = True
                ReleaseKey(A)
                ReleaseKey(W)
                ReleaseKey(D)
                time.sleep(1)

main()

I just arbitrarily picked "T" to pause the AI. I would go with P, but P pauses the actual game too, which I don't really want. I want to pause the AI, reset things, and then start it back up without having the pause screen in the way, so T it is. I am not too focused on perfecting this, but I did notice that sometimes you need to really hold the key to get things to pause. To un-pause, a quick press is usually all that is required.
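
The one-second sleeps in the toggle are presumably there to stop a single key press from flipping the pause state several times. A possible alternative, sketched under assumptions, is to toggle only on the transition from "not pressed" to "pressed". PauseToggle is a hypothetical helper, and the lists passed to update stand in for what key_check would return on successive loop iterations.

```python
class PauseToggle:
    """Flip `paused` once per key press, however long the key is held."""

    def __init__(self, key='T'):
        self.key = key
        self.paused = False
        self._was_down = False

    def update(self, keys):
        is_down = self.key in keys
        if is_down and not self._was_down:  # rising edge only
            self.paused = not self.paused
        self._was_down = is_down
        return self.paused

toggle = PauseToggle()
# Simulated key_check() results: T held for three iterations, released, tapped again.
frames = [[], ['T'], ['T'], ['T'], [], ['T'], []]
print([toggle.update(keys) for keys in frames])
# [False, True, True, True, True, False, False]
```

In the real loop you would still release A, W, and D at the moment the state flips to paused, as the main function above does.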

Anyway, the agent's artificial intelligence isn't the greatest, but it seems to do well on darker roads with well-painted lines. Not bad, and we're clearly on to something.

Testing self-driving car neural network - Python Plays GTA V