Python Programming Tutorials

Training Model Deep Learning - Halite II 2017 Artificial Intelligence Competition p.6

Welcome to part 6 of the Halite II tutorials, and part 3 of the deep learning with Halite tutorials. In this tutorial, we're going to write a script to train a model based on our winner's training data.

To do this, we're going to be using Keras, which is a high-level API that sits on top of TensorFlow to make writing our neural networks easier.

To follow along, you will need TensorFlow and Keras installed (pip install keras). If you do not have a GPU to run TensorFlow on, then you can just do a pip install tensorflow. If you do have a TensorFlow-capable GPU (an Nvidia GPU with a CUDA compute capability of roughly 3.0 or higher, depending on the TensorFlow release), then you can install the GPU version. Here's a tutorial for installing GPU tensorflow on Linux and here's a tutorial for installing GPU tensorflow on Windows.

To begin:

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.models import load_model
import random
from tqdm import tqdm
import numpy as np

Let's set some defaults now:

batch_size = 128
epochs = 10
test_size = 5000

Batch size is how many samples to feed through the network at a time. In general, a low batch size will take a very long time to train with, and the noisy updates may keep the model from fitting well at all. A too-high batch size can speed things up, but it can also hurt how well the model generalizes, and the batch may not fit in memory. Epochs measure how many times we cycle through the full dataset in training; somewhere between 1 and 10 is a good number to start with. Test size is how many samples we would like to reserve for out-of-sample testing.
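To make the relationship between these settings concrete, here is a small sketch of the arithmetic. The helper name updates_per_epoch is made up for illustration, and the sample count of 22,240 is borrowed from the training run shown later in this tutorial; your own numbers will differ.

```python
import math

batch_size = 128
epochs = 10

def updates_per_epoch(n_samples, batch_size):
    """Number of weight updates in one pass over the data (the last batch may be smaller)."""
    return math.ceil(n_samples / batch_size)

n_samples = 22240  # assumed training-set size, for illustration only
steps = updates_per_epoch(n_samples, batch_size)
last_batch = n_samples - (steps - 1) * batch_size

print(steps, "updates per epoch;", steps * epochs, "updates in total")
print("final batch holds", last_batch, "samples")
```

Halving the batch size roughly doubles the number of updates per epoch, which is one reason small batches take longer to get through the same data.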

Next, let's define our model names. Many times, you will want to continue training a pre-existing model. We do not have one yet, but we might in the future, so we name both the model we would load (in_model) and the model we are about to save (out_model):

in_model = 'model_checkpoint_{}_batch_{}_epochs.h5'.format(0,0)
out_model = 'model_checkpoint_{}_batch_{}_epochs.h5'.format(batch_size, epochs)

Now, if we do want to load a previous model, we'll do that. This is our first time through, so we will set the decision to load a model to false, but let's have the code:

load_prev_model = False

if load_prev_model:
    print("Loading model: ", in_model)
    model = load_model(in_model)
else:
    print("starting fresh!")
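If you flip load_prev_model to True before a checkpoint actually exists, load_model will fail. One possible guard, sketched here with a hypothetical helper called should_load (it is not part of Keras or of the script above), is to check for the file first:

```python
import os

def should_load(path, requested):
    """Return True only if loading was requested and the checkpoint file exists."""
    return requested and os.path.isfile(path)

in_model = 'model_checkpoint_{}_batch_{}_epochs.h5'.format(0, 0)
print("checkpoint name:", in_model)
print("would load:", should_load(in_model, requested=True))
```

You could then use the result in place of load_prev_model, so the script falls back to building a fresh model when the file is missing.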

Next, we need to load in the training data:

print("Reading input")
with open("train.in","r") as f:
    train_in = f.read().split('\n')
    train_in = [eval(i) for i in tqdm(train_in[:-1])]
    print("done train in")
    
print("Reading output")
with open("train.out","r") as f:
    train_out = f.read().split('\n')
    train_out = [eval(i) for i in tqdm(train_out[:-1])]
    print("done train out")

Easy enough: we read in the data, where each line is the string form of a vector, so we use eval to convert each line back into a list. Next, we need to properly balance the training data. Training data from the real world is rarely perfectly balanced. What I mean by that is we probably have far more samples that translate to "attack" than we do for "mine empty planet," since most of the game is spent attacking other players, and really only a minority of our ships can be miners if the game progresses very far.
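One aside on the loading step: eval will execute whatever Python it finds in the file, which is fine for data we generated ourselves but risky for files from anywhere else. A possible alternative, assuming every line is a plain list literal, is ast.literal_eval from the standard library. The helper name parse_vector_lines and the sample text below are made up for illustration:

```python
import ast

def parse_vector_lines(text):
    """Parse one Python-literal list per line, skipping blank lines."""
    return [ast.literal_eval(line) for line in text.split('\n') if line.strip()]

# Assumed sample content; the real files hold much longer vectors.
sample = "[1, 0, 0]\n[0, 1, 0]\n[0.5, -1, 3]\n"
vectors = parse_vector_lines(sample)
print(vectors)
```

Unlike eval, literal_eval refuses anything that is not a literal, so a line containing a function call raises an error instead of running.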

The problem with unbalanced data is that the network doesn't know anything about balance; it just wants to fit your data. If 75% of your data says "attack," then the model can learn that the best thing to do is always attack, and it will never mine, which will be problematic. Instead, we want to balance all of the data evenly, so let's do that. First, let's build lists of each of the choices:

attack_enemy = []
mine_our_planet = []
mine_empty_planet = []

print("balancing data...")
for n, _ in tqdm(enumerate(train_in)):
    input_layer = train_in[n]
    output_layer = train_out[n]

    if output_layer == [1,0,0]:
        attack_enemy.append([input_layer, output_layer])
    elif output_layer == [0,1,0]:
        mine_our_planet.append([input_layer, output_layer])
    elif output_layer == [0,0,1]:
        mine_empty_planet.append([input_layer, output_layer])

Now, just for debugging purposes, let's check the lengths:

print(len(attack_enemy), len(mine_our_planet), len(mine_empty_planet))
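To see why those lengths matter, it can help to turn them into shares. This sketch uses the three counts printed in the training run shown near the end of this tutorial; the dictionary is just an illustrative stand-in for the three lists:

```python
# Class counts taken from the run's output later in this tutorial.
counts = {
    "attack_enemy": 152883,
    "mine_our_planet": 10836,
    "mine_empty_planet": 9904,
}

total = sum(counts.values())
for name, count in counts.items():
    print("{}: {:.1%}".format(name, count / total))

# Accuracy a model would get by always predicting the most common class.
majority_share = max(counts.values()) / total
print("always-attack accuracy: {:.1%}".format(majority_share))
```

A model that always answers "attack" would already look quite accurate on this unbalanced data, which is exactly the trap we want to avoid.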

Now let's grab whatever the shortest one is:

shortest = min(len(attack_enemy), len(mine_our_planet), len(mine_empty_planet))

Next, let's shuffle all of these:

random.shuffle(attack_enemy)
random.shuffle(mine_our_planet)
random.shuffle(mine_empty_planet)

Finally, let's trim all of these to be as long as the shortest one, so we're evenly balanced:

attack_enemy = attack_enemy[:shortest]
mine_our_planet = mine_our_planet[:shortest]
mine_empty_planet = mine_empty_planet[:shortest]

Now let's check lengths:

print(len(attack_enemy), len(mine_our_planet), len(mine_empty_planet))

Let's add them all together, and shuffle them now:

all_choices = attack_enemy + mine_our_planet + mine_empty_planet
random.shuffle(all_choices)

Great, now we have balanced and shuffled training data. We shuffle so that each batch contains a mix of labels; feeding large batches of identically-labeled data would push the network toward one answer at a time rather than teaching it to tell the choices apart.
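The balancing steps above can also be wrapped up in one function. This is only a sketch under a couple of assumptions: the name balance is made up, it groups by whatever labels it finds rather than the three hard-coded ones, and it takes an optional seed so a run can be repeated.

```python
import random
from collections import defaultdict

def balance(samples, labels, seed=None):
    """Undersample every class down to the size of the rarest one, then shuffle."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for x, y in zip(samples, labels):
        by_label[tuple(y)].append([x, y])  # lists are unhashable, so key on a tuple
    shortest = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        rng.shuffle(group)
        balanced.extend(group[:shortest])
    rng.shuffle(balanced)
    return balanced

# Toy data: six "attack", three "mine ours", one "mine empty".
toy_in = [[i] for i in range(10)]
toy_out = [[1, 0, 0]] * 6 + [[0, 1, 0]] * 3 + [[0, 0, 1]] * 1
pairs = balance(toy_in, toy_out, seed=0)
print(len(pairs))
```

Note that undersampling throws data away: here the rarest class has a single sample, so only one sample per class survives.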

Now, because machine learning classifiers take input and produce output, or, when training, take input and attempt to fit to output, we need to re-split up this data:

train_in = []
train_out = []

print("rebuilding training data...")
for x,y in tqdm(all_choices):
    train_in.append(x)
    train_out.append(y)
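As an aside, the same re-split can be written with zip, which "unzips" a list of pairs into two columns. The two-sample all_choices below is made-up stand-in data:

```python
# Stand-in for the real all_choices list of [input, output] pairs.
all_choices = [[[0.1, 0.2], [1, 0, 0]],
               [[0.3, 0.4], [0, 0, 1]]]

# zip(*pairs) yields one tuple of inputs and one tuple of outputs.
train_in, train_out = (list(column) for column in zip(*all_choices))

print(train_in)
print(train_out)
```

The explicit loop has one advantage: it still works when all_choices is empty, whereas this unpacking would raise an error.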

Now, because you may want to try various models, and this training data can be quite large, let's save this training data so we don't necessarily need to perform all these operations if we just want to try more epochs or a different shaped model:

np.save("train_in.npy", train_in)
np.save("train_out.npy", train_out)

Now, everything up to this point can be skipped with:

train_in = np.load("train_in.npy")
train_out = np.load("train_out.npy")

Alrighty, let's split up the data into training and testing data. The training data will be fit by the model, and the testing data will be tested against. The hope is that the model's "accuracy" matches the "testing accuracy" pretty closely. If the model's accuracy is something like 90%, but the testing accuracy is 60%, this tells us that we've almost certainly over-fit the data in training.

x_train = train_in[:-test_size]
y_train = train_out[:-test_size]

x_test = train_in[-test_size:]
y_test = train_out[-test_size:]
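One thing to watch with negative slicing: if test_size were 0, train_in[:-0] would be an empty list, and if it were larger than the dataset there would be nothing left to train on. A small sketch of a guarded split follows; the helper name holdout_split is hypothetical, and the list of integers stands in for the real samples:

```python
def holdout_split(data, test_size):
    """Reserve the last test_size items for testing and the rest for training."""
    if not 0 < test_size < len(data):
        raise ValueError("test_size must be between 1 and len(data) - 1")
    return data[:-test_size], data[-test_size:]

rows = list(range(29712))  # stand-in for the balanced samples from the run below
train_rows, test_rows = holdout_split(rows, 5000)
print(len(train_rows), len(test_rows))
```

Because the data was shuffled before this point, taking the last 5,000 rows gives a random sample rather than a block of one label.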

Now, if we didn't load a model, then we need to create one:

print('Building model...')
if not load_prev_model:
    model = Sequential()
    model.add(Dense(256, input_shape=(len(train_in[0]),)))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(256, input_shape=(256,)))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(len(train_out[0])))
    model.add(Activation('softmax'))

    model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])

In this case, we're building a 2x256 hidden-layered neural network with a 50% dropout between the layers, with rectified linear activation and the adam optimizer. If you do not know what any of that means, and would like to learn more, check out the deep learning tutorials on this website.
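To get a feel for how big this network is, we can count its weights by hand. Each Dense layer has one weight per input-output pair plus one bias per unit, and Activation and Dropout add none. The function names here are made up, and n_features=10 is only a placeholder since the real input length depends on your training data:

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases for one fully connected layer."""
    return n_inputs * n_units + n_units

def count_params(n_features, hidden=(256, 256), n_classes=3):
    """Total trainable parameters for input -> hidden layers -> output."""
    sizes = [n_features, *hidden, n_classes]
    return sum(dense_params(a, b) for a, b in zip(sizes, sizes[1:]))

print("hidden-to-hidden layer:", dense_params(256, 256))
print("output layer:", dense_params(256, 3))
print("whole model with 10 inputs:", count_params(n_features=10))
```

You can compare the result for your own input length against the total that model.summary() prints.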

Now we fit (train) the model:

history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_split=0.1)

While fitting, we will see the model's "accuracy" on the training data, which is not "out of sample," along with val_loss and val_acc measured on the 10% of samples that validation_split holds back. Once we've fully trained the model, we should still test it with data that definitely wasn't used for training in any way:
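Keras takes the validation_split fraction from the end of the data it is given, before any shuffling. As a rough sketch of the bookkeeping, assuming the split point is simply truncated to a whole number (an assumption that happens to reproduce the sizes in the run below), with a made-up helper name:

```python
def validation_split_sizes(n_samples, validation_split):
    """Return (samples used for fitting, samples held back for validation)."""
    split_at = int(n_samples * (1.0 - validation_split))
    return split_at, n_samples - split_at

n_fit = 29712 - 5000  # balanced samples minus the reserved test set
print(validation_split_sizes(n_fit, 0.1))
```

So the data ends up in three pieces: samples the model fits on, validation samples checked after every epoch, and the test set we evaluate once at the end.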

score = model.evaluate(x_test, y_test,
                       batch_size=batch_size, verbose=1)

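Now, we'll save the model and output some stats. Since we compiled with metrics=['accuracy'], evaluate returns the loss first (printed here as the "test score") and the accuracy second: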

model.save(out_model)
print("Model saved to:",out_model)
print('Test score:', score[0])
print('Test accuracy:', score[1])
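To give those two numbers some meaning, here is a plain-Python sketch of what categorical cross-entropy and accuracy measure. This is not how Keras computes them internally; the function names and the eps guard against log(0) are assumptions for illustration:

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Average of -sum(t * log(p)) over all samples."""
    total = 0.0
    for truth, pred in zip(y_true, y_pred):
        total -= sum(t * math.log(max(p, eps)) for t, p in zip(truth, pred))
    return total / len(y_true)

def accuracy(y_true, y_pred):
    """Fraction of samples whose highest-scoring class matches the label."""
    hits = sum(t.index(max(t)) == p.index(max(p)) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
uniform = [[1 / 3] * 3 for _ in y_true]  # a model with no preference at all
print("loss for uniform guesses:", categorical_crossentropy(y_true, uniform))
print("ln(3):", math.log(3))
```

A model that spreads its probability evenly over three classes scores a loss of ln(3), so a loss well above that means the model is confidently wrong on a good share of samples.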

Full code:

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.models import load_model
import random
from tqdm import tqdm
import numpy as np

batch_size = 128
epochs = 10
test_size = 5000

in_model = 'model_checkpoint_{}_batch_{}_epochs.h5'.format(0,0)
out_model = 'model_checkpoint_{}_batch_{}_epochs.h5'.format(batch_size, epochs)

load_prev_model = False

if load_prev_model:
    print("Loading model: ", in_model)
    model = load_model(in_model)
else:
    print("starting fresh!")


print("Reading input")
with open("train.in","r") as f:
    train_in = f.read().split('\n')
    train_in = [eval(i) for i in tqdm(train_in[:-1])]
    print("done train in")
    
print("Reading output")
with open("train.out","r") as f:
    train_out = f.read().split('\n')
    train_out = [eval(i) for i in tqdm(train_out[:-1])]
    print("done train out")

attack_enemy = []
mine_our_planet = []
mine_empty_planet = []

print("balancing data...")
for n, _ in tqdm(enumerate(train_in)):
    input_layer = train_in[n]
    output_layer = train_out[n]

    if output_layer == [1,0,0]:
        attack_enemy.append([input_layer, output_layer])
    elif output_layer == [0,1,0]:
        mine_our_planet.append([input_layer, output_layer])
    elif output_layer == [0,0,1]:
        mine_empty_planet.append([input_layer, output_layer])

print(len(attack_enemy), len(mine_our_planet), len(mine_empty_planet))
shortest = min(len(attack_enemy), len(mine_our_planet), len(mine_empty_planet))

random.shuffle(attack_enemy)
random.shuffle(mine_our_planet)
random.shuffle(mine_empty_planet)

attack_enemy = attack_enemy[:shortest]
mine_our_planet = mine_our_planet[:shortest]
mine_empty_planet = mine_empty_planet[:shortest]

print(len(attack_enemy), len(mine_our_planet), len(mine_empty_planet))

all_choices = attack_enemy + mine_our_planet + mine_empty_planet
random.shuffle(all_choices)

train_in = []
train_out = []

print("rebuilding training data...")
for x,y in tqdm(all_choices):
    train_in.append(x)
    train_out.append(y)

np.save("train_in.npy", train_in)
np.save("train_out.npy", train_out)

train_in = np.load("train_in.npy")
train_out = np.load("train_out.npy")

print('train_in:',len(train_in))

x_train = train_in[:-test_size]
y_train = train_out[:-test_size]

x_test = train_in[-test_size:]
y_test = train_out[-test_size:]

print('Building model...')
if not load_prev_model:
    model = Sequential()
    model.add(Dense(256, input_shape=(len(train_in[0]),)))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(256, input_shape=(256,)))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(len(train_out[0])))
    model.add(Activation('softmax'))

    model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])

history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_split=0.1)

score = model.evaluate(x_test, y_test,
                       batch_size=batch_size, verbose=1)

model.save(out_model)
print("Model saved to:",out_model)
print('Test score:', score[0])
print('Test accuracy:', score[1])

Let's train!

C:\Users\H\Desktop\Halite II >python "model_trainer_3_options.py"
starting fresh!
Reading input
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 173623/173623 [00:16<00:00, 10513.10it/s]
done train in
Reading output
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 173623/173623 [00:01<00:00, 97884.36it/s]
done train out
balancing data...
173623it [00:00, 495445.20it/s]
152883 10836 9904
9904 9904 9904
rebuilding training data...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 29712/29712 [00:00<00:00, 1290247.76it/s]
train_in: 29712
Building model...
Train on 22240 samples, validate on 2472 samples
Epoch 1/10
2018-01-08 09:02:17.175695: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2018-01-08 09:02:17.546411: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.6705
pciBusID: 0000:02:00.0
totalMemory: 11.00GiB freeMemory: 9.10GiB
2018-01-08 09:02:17.546508: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0, compute capability: 6.1)
22240/22240 [==============================] - 2s 103us/step - loss: 9.0471 - acc: 0.4180 - val_loss: 7.7675 - val_acc: 0.4822
Epoch 2/10
22240/22240 [==============================] - 1s 31us/step - loss: 8.4197 - acc: 0.4653 - val_loss: 7.7409 - val_acc: 0.5121
Epoch 3/10
22240/22240 [==============================] - 1s 30us/step - loss: 7.9926 - acc: 0.4897 - val_loss: 7.6013 - val_acc: 0.5150
Epoch 4/10
22240/22240 [==============================] - 1s 31us/step - loss: 8.0825 - acc: 0.4867 - val_loss: 7.7277 - val_acc: 0.5125
Epoch 5/10
22240/22240 [==============================] - 1s 31us/step - loss: 7.8882 - acc: 0.5019 - val_loss: 7.3081 - val_acc: 0.5336
Epoch 6/10
22240/22240 [==============================] - 1s 31us/step - loss: 7.7212 - acc: 0.5107 - val_loss: 7.2023 - val_acc: 0.5461
Epoch 7/10
22240/22240 [==============================] - 1s 31us/step - loss: 7.6613 - acc: 0.5156 - val_loss: 7.2052 - val_acc: 0.5445
Epoch 8/10
22240/22240 [==============================] - 1s 31us/step - loss: 7.5114 - acc: 0.5261 - val_loss: 7.2727 - val_acc: 0.5396
Epoch 9/10
22240/22240 [==============================] - 1s 31us/step - loss: 7.5069 - acc: 0.5255 - val_loss: 6.9958 - val_acc: 0.5514
Epoch 10/10
22240/22240 [==============================] - 1s 31us/step - loss: 7.2216 - acc: 0.5409 - val_loss: 6.5766 - val_acc: 0.5793
5000/5000 [==============================] - 0s 13us/step
Model saved to: model_checkpoint_128_batch_10_epochs.h5
Test score: 6.880058638
Test accuracy: 0.5618

As you can see, I did this with only about 30K balanced samples (roughly 22K of them actually used for fitting), which really isn't enough. We'd like to have more like 100K+, so let's play some more games and come back. That said, we can see we ended with ~54% accuracy in training and the test accuracy was 56%, so we're relatively confident we didn't over-fit.

Since our data is perfectly balanced, "random" choice would yield 33% accuracy, so 56% accuracy is actually something, but we can probably do better in time. Let's say this is enough for now. How do we go about using our new AI and competing with it? That's what we'll be covering in the next tutorial.
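The two checks above boil down to a bit of arithmetic, sketched here with the accuracy figures copied from the run's output:

```python
n_classes = 3
chance = 1 / n_classes   # expected accuracy of random guessing on balanced data

train_acc = 0.5409       # final-epoch training accuracy from the log
test_acc = 0.5618        # held-out test accuracy from the log

gap = train_acc - test_acc   # a large positive gap would hint at over-fitting
lift = test_acc / chance     # how many times better than chance we are

print("chance: {:.3f}".format(chance))
print("train minus test: {:+.4f}".format(gap))
print("lift over chance: {:.2f}x".format(lift))
```

The test accuracy coming out slightly higher than the training accuracy is not unusual here, since dropout is active while the training accuracy is being measured and switched off during evaluation.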

The next tutorial: Deploying Model Deep Learning - Halite II 2017 Artificial Intelligence Competition p.7