Welcome to Part 10 of the creating an Artificial Intelligence bot in StarCraft II with Python series. In this tutorial, we're going to work on creating our model.
The focus here is purely to see whether a model can learn from this style of input data. The training data I built and will be using can be found here: Stage 1, 2868 games data vs Hard AI. You do not necessarily need to grab this data; it will not be our final data if this approach is successful.
Once you have the data, extract it, and you're ready to rumble. First, we need to devise the structure of our convolutional neural network. I will also be making use of Keras, a framework that sits on top of TensorFlow. As long as you already have TensorFlow, it's just a pip install. I am using Keras version 2.1.2 and TensorFlow version 1.8.0 here.
If you do not know about neural networks, it is advised that you at least visit the deep learning tutorials from the machine learning series.
To begin, I was actually having a hard time getting anything to learn. I found this to be quite a challenge in the Halite II competition as well, where an exceptionally low starting learning rate turned out to be the solution. That model started with a 1e-5 learning rate and ended on 1e-6 (0.00001 to 0.000001). Normally, you would start with more like 1e-3 and stop at 1e-4 (0.001 to 0.0001). Here, I found that starting at 1e-4 was enough to begin learning. For this part, our model script will begin with the following imports:
import keras  # Keras 2.1.2 and TF-GPU 1.8.0
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.callbacks import TensorBoard
import numpy as np
import os
import random
We're going to import Keras, obviously, but then also specifically the Sequential model type, Dense layers, Dropout, and Flatten (to flatten the data before passing it through the final, regular dense layer). Since we're building a convolutional neural network, we're going to use Conv2D and MaxPooling2D for that. I also want to be able to visualize the model's training, so we'll be using TensorBoard.

Our data is stored with numpy, so we'll use numpy to load it back in, as well as to shape it. We're going to use os to iterate over the directory containing the data, and random to shuffle it about.
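To make that concrete, here is a minimal sketch of listing and shuffling the data files. The directory name "train_data" is a hypothetical placeholder; point it at wherever you extracted the download.

```python
import os
import random

# "train_data" is a hypothetical directory name; point this at wherever
# you extracted the downloaded training data.
train_data_dir = "train_data"

all_files = []
if os.path.isdir(train_data_dir):
    # Collect the per-game .npy files and shuffle them so games are not
    # always fed to the network in the order they were recorded.
    all_files = [f for f in os.listdir(train_data_dir) if f.endswith(".npy")]
    random.shuffle(all_files)

print(len(all_files), "game files found")
```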
Now, let's build our model:
model = Sequential()
This just means that we have a regular type of model. Things are going to go in order.
Now for our main hidden convolutional layers:
model.add(Conv2D(32, (3, 3), padding='same',
                 input_shape=(176, 200, 3),
                 activation='relu'))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))

model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))

model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
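As a sanity check, we can trace how the 176x200 input shrinks through these three blocks. The helpers below are just illustrative arithmetic (not part of the model script) for 3x3 convolutions with stride 1 and 2x2 max pooling:

```python
def conv_out(size, kernel=3, padding="valid"):
    """Spatial size after a 3x3, stride-1 convolution."""
    return size if padding == "same" else size - (kernel - 1)

def pool_out(size, pool=2):
    """Spatial size after 2x2 max pooling (floor division)."""
    return size // pool

h, w = 176, 200
for _ in range(3):  # three conv/conv/pool blocks
    h, w = conv_out(h, padding="same"), conv_out(w, padding="same")
    h, w = conv_out(h), conv_out(w)
    h, w = pool_out(h), pool_out(w)
    print(h, w)  # 87 99, then 42 48, then 20 23

print("flattened features:", h * w * 128)  # → 58880, feeds the dense layer
```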
Next we'll add one fully-connected dense layer:
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
Finally the output layer:
model.add(Dense(4, activation='softmax'))
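The softmax activation squashes the four output neurons into a probability distribution over our four possible choices. A quick numpy illustration (the logit values here are made up):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()

# Made-up pre-activation values for the four possible choices.
logits = np.array([2.0, 1.0, 0.1, -1.0])
probs = softmax(logits)

print(probs.sum())            # softmax output always sums to 1.0
print(int(np.argmax(probs)))  # index of the action the bot would take
```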
Now we need to set up the compile settings for the network:
learning_rate = 0.0001
opt = keras.optimizers.Adam(lr=learning_rate, decay=1e-6)
model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])
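Categorical crossentropy measures how far the model's softmax output is from the one-hot label. With a one-hot label it reduces to the negative log of the probability assigned to the correct class, which a short numpy sketch (with made-up values) makes clear:

```python
import numpy as np

y_true = np.array([0.0, 0.0, 1.0, 0.0])  # one-hot label: correct choice is index 2
y_pred = np.array([0.1, 0.1, 0.7, 0.1])  # made-up softmax output from the model

# Categorical crossentropy: -sum(y_true * log(y_pred)). With a one-hot
# label, only the true class's term survives, giving -log(0.7) here.
loss = -np.sum(y_true * np.log(y_pred))
print(loss)
```

The closer the predicted probability for the correct class gets to 1.0, the closer this loss gets to 0.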
Lastly, we want to log everything via TensorBoard, so we'll create the callback:
tensorboard = TensorBoard(log_dir="logs/stage1")
Now that we have our model, we need to pass the data through it. Since our data already exceeds my GPU's VRAM, and our future data will likely vastly exceed it, we need a way to load the data in batches, which is what we'll be working on in the next tutorial.