Welcome to part 6 of the Halite II tutorials, and part 3 of the deep learning with Halite tutorials. In this tutorial, we're going to write a script to train a model based on our winner's training data.
To do this, we're going to be using Keras, which is a high-level API that sits on top of TensorFlow to make writing our neural networks easier.
To follow along, you will need TensorFlow and Keras installed (pip install keras). If you do not have a GPU to run TensorFlow on, then you can just do a pip install tensorflow. If you do have a TensorFlow-capable GPU (an Nvidia GPU with a CUDA compute capability of 3.0 or higher at the time of writing; check the TensorFlow install docs for the current requirement), then you can install the GPU version. Here's a tutorial for installing GPU tensorflow on Linux and here's a tutorial for installing GPU tensorflow on Windows.
To begin:
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.models import load_model
import random
from tqdm import tqdm
import numpy as np
Let's set some defaults now:
batch_size = 128
epochs = 10
test_size = 5000
Batch size is how many samples to feed through the network at a time. In general, a very low batch size will take a very long time to train with, and the noisy updates may never settle on a good fit. A too-high batch size can speed things up, but it needs more memory and can also hurt how well the model generalizes (over-fitting). Epochs measure how many times we cycle through the full training dataset. Somewhere between 1 and 10 is a good number to start with. Test size is how many samples we would like to reserve for out-of-sample testing.
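To make the relationship between batch size and epochs concrete, here is a minimal sketch of the arithmetic. The sample count is just an illustrative number, and steps_per_epoch is a helper name made up for this example, not a Keras function.

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Number of weight updates in one pass over the data (the last batch may be partial)."""
    return math.ceil(num_samples / batch_size)

batch_size = 128
epochs = 10
num_samples = 22240  # illustrative training-set size

per_epoch = steps_per_epoch(num_samples, batch_size)
total_updates = per_epoch * epochs
print(per_epoch, total_updates)
```

Halving the batch size roughly doubles the number of updates per epoch, which is one reason small batches take longer to get through.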
Next, let's define our model names. Many times, you may want to continue training a pre-existing model. We do not have one yet, but we might in the future, so we want a name for the model to load (in_model) and a name for the model we are about to save (out_model):
in_model = 'model_checkpoint_{}_batch_{}_epochs.h5'.format(0,0)
out_model = 'model_checkpoint_{}_batch_{}_epochs.h5'.format(batch_size, epochs)
Now, if we do want to load a previous model, we'll do that. This is our first time through, so we will set the decision to load a model to false, but let's have the code:
load_prev_model = False

if load_prev_model:
    print("Loading model: ", in_model)
    model = load_model(in_model)
else:
    print("starting fresh!")
Next, we need to load in the training data:
print("Reading input")
with open("train.in","r") as f:
    train_in = f.read().split('\n')
    train_in = [eval(i) for i in tqdm(train_in[:-1])]
    print("done train in")

print("Reading output")
with open("train.out","r") as f:
    train_out = f.read().split('\n')
    train_out = [eval(i) for i in tqdm(train_out[:-1])]
    print("done train out")
Easy enough. We read in the data, where each line is a string representation of a vector, so we use eval to convert each line back into a list. Next, we need to properly balance the training data. In general, training data from the real world is not perfectly balanced. What I mean by that is we probably have far more samples of data that translate to "attack" than we do for "mine empty planet," since most of the game is spent attacking other players, and really only a minority of our ships can be miners if the game progresses very far.
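On the parsing step: eval will run whatever code is in the file, which is fine for data you generated yourself but worth knowing about. If each line really is just a plain list literal of numbers (an assumption about the file format), the standard library's ast.literal_eval does the same job without executing arbitrary code. A minimal sketch, with parse_vector_lines being a made-up helper name:

```python
import ast

def parse_vector_lines(text):
    """Parse newline-separated Python list literals into lists, skipping blank lines."""
    return [ast.literal_eval(line) for line in text.split("\n") if line.strip()]

# Tiny stand-in for the contents of a file like train.out
sample = "[1, 0, 0]\n[0, 1, 0]\n[0.5, -2, 3]\n"
vectors = parse_vector_lines(sample)
print(vectors)
```

Skipping blank lines also removes the need for the [:-1] slice that drops the trailing empty string.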
The problem with unbalanced data is that the network doesn't know anything about balance; it just wants to fit your data. If 75% of your data says "attack," then the model can score 75% just by always attacking, so it may learn to never mine, which will be problematic. Instead, we want to balance all of the data evenly, so let's do that. First, let's build lists of each of the choices:
attack_enemy = []
mine_our_planet = []
mine_empty_planet = []

print("balancing data...")
for n, _ in tqdm(enumerate(train_in)):
    input_layer = train_in[n]
    output_layer = train_out[n]

    if output_layer == [1,0,0]:
        attack_enemy.append([input_layer, output_layer])
    elif output_layer == [0,1,0]:
        mine_our_planet.append([input_layer, output_layer])
    elif output_layer == [0,0,1]:
        mine_empty_planet.append([input_layer, output_layer])
Now, just for debugging purposes, let's check the lengths:
print(len(attack_enemy), len(mine_our_planet), len(mine_empty_planet))
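To get a feel for why those lengths matter, here is a small sketch of the "always predict the biggest class" baseline. The counts are the ones printed in the training run shown at the end of this tutorial; majority_baseline is just a helper name for this example.

```python
def majority_baseline(class_counts):
    """Accuracy of a model that always predicts the most common class."""
    return max(class_counts) / sum(class_counts)

# attack_enemy, mine_our_planet, mine_empty_planet
unbalanced = [152883, 10836, 9904]
balanced = [9904, 9904, 9904]

print(round(majority_baseline(unbalanced), 4))
print(round(majority_baseline(balanced), 4))
```

With the unbalanced counts, a model that only ever attacks would already look very accurate, while on balanced data the same lazy strategy drops to one in three.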
Now let's grab whatever the shortest one is:
shortest = min(len(attack_enemy), len(mine_our_planet), len(mine_empty_planet))
Next, let's shuffle all of these:
random.shuffle(attack_enemy)
random.shuffle(mine_our_planet)
random.shuffle(mine_empty_planet)
Finally, let's trim all of these to be as long as the shortest one, so we're evenly balanced:
attack_enemy = attack_enemy[:shortest]
mine_our_planet = mine_our_planet[:shortest]
mine_empty_planet = mine_empty_planet[:shortest]
Now let's check lengths:
print(len(attack_enemy), len(mine_our_planet), len(mine_empty_planet))
Let's add them all together, and shuffle them now:
all_choices = attack_enemy + mine_our_planet + mine_empty_planet
random.shuffle(all_choices)
Great, now we have balanced and shuffled training data. We want to shuffle the data so we don't feed large batches of identically-labeled data, which would be very confusing to the machine.
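If you end up with more than three output options later, the same shuffle-and-trim steps can be wrapped in one function. This is only a sketch of that idea, not the code used in this tutorial: undersample is a hypothetical helper, and it groups by whatever labels it finds rather than by the three hard-coded vectors.

```python
import random
from collections import defaultdict

def undersample(samples, seed=None):
    """Balance (features, label) pairs by trimming every class to the rarest class size."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for features, label in samples:
        by_label[tuple(label)].append((features, label))

    shortest = min(len(group) for group in by_label.values())

    balanced = []
    for group in by_label.values():
        rng.shuffle(group)              # pick a random subset of each class
        balanced.extend(group[:shortest])
    rng.shuffle(balanced)               # mix the classes together
    return balanced

# Toy data: 8 "attack", 3 "mine ours", 5 "mine empty"
data = ([([i], [1, 0, 0]) for i in range(8)]
        + [([i], [0, 1, 0]) for i in range(3)]
        + [([i], [0, 0, 1]) for i in range(5)])
balanced = undersample(data, seed=0)
print(len(balanced))
```

Passing a seed makes the selection repeatable, which helps when comparing models trained on "the same" balanced data. Note that an empty input would raise a ValueError at the min call.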
Now, because machine learning classifiers take input and produce output, or, when training, take input and attempt to fit to output, we need to re-split up this data:
train_in = []
train_out = []

print("rebuilding training data...")
for x,y in tqdm(all_choices):
    train_in.append(x)
    train_out.append(y)
Now, because you may want to try various models, and this training data can be quite large, let's save this training data so we don't necessarily need to perform all these operations if we just want to try more epochs or a different shaped model:
np.save("train_in.npy", train_in)
np.save("train_out.npy", train_out)
Now, everything up to this point can be skipped with:
train_in = np.load("train_in.npy")
train_out = np.load("train_out.npy")
Alrighty, let's split up the data into training and testing data. The training data will be fit by the model, and the testing data will be tested against. The hope is that the model's "accuracy" matches the "testing accuracy" pretty closely. If the model's accuracy is something like 90%, but the testing accuracy is 60%, this tells us that we've almost certainly over-fit the data in training.
x_train = train_in[:-test_size]
y_train = train_out[:-test_size]

x_test = train_in[-test_size:]
y_test = train_out[-test_size:]
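One thing to be aware of with this negative-index slicing: it only works when test_size is greater than zero, since data[:-0] is the same as data[:0] and gives you nothing. A small sketch with plain lists (the same slicing rules apply to NumPy arrays); split_tail is a made-up helper name:

```python
def split_tail(data, test_size):
    """Hold out the last `test_size` items for testing; guards the test_size == 0 case."""
    if test_size <= 0:
        return data, data[:0]
    return data[:-test_size], data[-test_size:]

samples = list(range(10))
train, test = split_tail(samples, 3)
print(train, test)

# The gotcha: -0 is just 0, so this slices up to index 0 and comes back empty
print(samples[:-0])
```

With test_size = 5000 as above this is not an issue, but it can bite if you ever set the test size to 0 to train on everything.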
Now, if we didn't load a model, then we need to create one:
print('Building model...')
if not load_prev_model:
    model = Sequential()
    model.add(Dense(256, input_shape=(len(train_in[0]),)))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(256, input_shape=(256,)))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(len(train_out[0])))
    model.add(Activation('softmax'))

    model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
In this case, we're building a 2x256 hidden-layered neural network with a 50% dropout between the layers, with rectified linear activation and the adam optimizer. If you do not know what any of that means, and would like to learn more, check out the deep learning tutorials on this website.
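For a rough idea of how big this network is, you can count the trainable parameters by hand: each Dense layer has one weight per input-output pair plus one bias per output, while the Activation and Dropout layers add none. A quick sketch, where the input size of 100 is purely a hypothetical stand-in for len(train_in[0]):

```python
def dense_params(n_in, n_out):
    """Trainable parameters in a fully connected layer: weights plus biases."""
    return n_in * n_out + n_out

def mlp_params(layer_sizes):
    """Total parameters for consecutive Dense layers of the given sizes."""
    return sum(dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:]))

n_inputs = 100   # hypothetical; use len(train_in[0]) for the real value
n_outputs = 3    # attack, mine our planet, mine empty planet
total = mlp_params([n_inputs, 256, 256, n_outputs])
print(total)
```

Calling model.summary() on the built Keras model reports the same kind of per-layer count, so you can check this against your actual input size.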
Now we fit (train) the model:
history = model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, verbose=1, validation_split=0.1)
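The validation_split=0.1 here means Keras holds back part of x_train and y_train for validation, taken from the end of the arrays before any shuffling, according to the Keras documentation. Here is a sketch of how the sizes work out, assuming the split point is computed by truncating n * (1 - validation_split); the helper name is made up for this example.

```python
def validation_split_sizes(num_samples, validation_split):
    """Return (train_count, validation_count), assuming a truncated split point."""
    split_at = int(num_samples * (1.0 - validation_split))
    return split_at, num_samples - split_at

total_balanced = 29712   # balanced sample count from the run shown later
test_size = 5000

train_n, val_n = validation_split_sizes(total_balanced - test_size, 0.1)
print(train_n, val_n)
```

These numbers line up with the "Train on 22240 samples, validate on 2472 samples" message in the training output further down.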
While fitting, we will see the model's "accuracy" on the training data, which is in-sample (the model was fit to that same data), along with a validation accuracy on the 10% that validation_split holds back. Once we've fully trained the model, we should also test it with data that definitely wasn't used for training:
score = model.evaluate(x_test, y_test, batch_size=batch_size, verbose=1)
Now, we'll save the model and output some stats:
model.save(out_model)
print("Model saved to:",out_model)

print('Test score:', score[0])
print('Test accuracy:', score[1])
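Since we compiled with loss='categorical_crossentropy' and metrics=['accuracy'], score[0] is the loss on the test data and score[1] is the accuracy. To give the loss number some meaning, here is a plain-Python sketch of categorical cross-entropy for a single sample. The clipping value is an illustrative choice, not necessarily what Keras uses internally.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy for one sample: -sum(true * log(pred)), clipped to avoid log(0)."""
    return -sum(t * math.log(min(max(p, eps), 1.0)) for t, p in zip(y_true, y_pred))

label = [1, 0, 0]                      # "attack"
uniform = [1 / 3, 1 / 3, 1 / 3]        # no idea, equal odds
confident_right = [0.98, 0.01, 0.01]
confident_wrong = [0.01, 0.98, 0.01]

for pred in (uniform, confident_right, confident_wrong):
    print(round(categorical_crossentropy(label, pred), 4))
```

A model that guesses uniformly across three classes lands at ln(3), about 1.0986, and confident wrong answers are punished much harder than confident right answers are rewarded.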
Full code:
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.models import load_model
import random
from tqdm import tqdm
import numpy as np

batch_size = 128
epochs = 10
test_size = 5000

in_model = 'model_checkpoint_{}_batch_{}_epochs.h5'.format(0,0)
out_model = 'model_checkpoint_{}_batch_{}_epochs.h5'.format(batch_size, epochs)

load_prev_model = False

if load_prev_model:
    print("Loading model: ", in_model)
    model = load_model(in_model)
else:
    print("starting fresh!")

print("Reading input")
with open("train.in","r") as f:
    train_in = f.read().split('\n')
    train_in = [eval(i) for i in tqdm(train_in[:-1])]
    print("done train in")

print("Reading output")
with open("train.out","r") as f:
    train_out = f.read().split('\n')
    train_out = [eval(i) for i in tqdm(train_out[:-1])]
    print("done train out")

attack_enemy = []
mine_our_planet = []
mine_empty_planet = []

print("balancing data...")
for n, _ in tqdm(enumerate(train_in)):
    input_layer = train_in[n]
    output_layer = train_out[n]

    if output_layer == [1,0,0]:
        attack_enemy.append([input_layer, output_layer])
    elif output_layer == [0,1,0]:
        mine_our_planet.append([input_layer, output_layer])
    elif output_layer == [0,0,1]:
        mine_empty_planet.append([input_layer, output_layer])

print(len(attack_enemy), len(mine_our_planet), len(mine_empty_planet))

shortest = min(len(attack_enemy), len(mine_our_planet), len(mine_empty_planet))

random.shuffle(attack_enemy)
random.shuffle(mine_our_planet)
random.shuffle(mine_empty_planet)

attack_enemy = attack_enemy[:shortest]
mine_our_planet = mine_our_planet[:shortest]
mine_empty_planet = mine_empty_planet[:shortest]

print(len(attack_enemy), len(mine_our_planet), len(mine_empty_planet))

all_choices = attack_enemy + mine_our_planet + mine_empty_planet
random.shuffle(all_choices)

train_in = []
train_out = []

print("rebuilding training data...")
for x,y in tqdm(all_choices):
    train_in.append(x)
    train_out.append(y)

np.save("train_in.npy", train_in)
np.save("train_out.npy", train_out)

train_in = np.load("train_in.npy")
train_out = np.load("train_out.npy")

print('train_in:',len(train_in))

x_train = train_in[:-test_size]
y_train = train_out[:-test_size]

x_test = train_in[-test_size:]
y_test = train_out[-test_size:]

print('Building model...')
if not load_prev_model:
    model = Sequential()
    model.add(Dense(256, input_shape=(len(train_in[0]),)))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(256, input_shape=(256,)))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(len(train_out[0])))
    model.add(Activation('softmax'))

    model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])

history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_split=0.1)

score = model.evaluate(x_test, y_test,
                       batch_size=batch_size, verbose=1)

model.save(out_model)
print("Model saved to:",out_model)

print('Test score:', score[0])
print('Test accuracy:', score[1])
Let's train!
C:\Users\H\Desktop\Halite II >python "model_trainer_3_options.py"
starting fresh!
Reading input
100%|██████████| 173623/173623 [00:16<00:00, 10513.10it/s]
done train in
Reading output
100%|██████████| 173623/173623 [00:01<00:00, 97884.36it/s]
done train out
balancing data...
173623it [00:00, 495445.20it/s]
152883 10836 9904
9904 9904 9904
rebuilding training data...
100%|██████████| 29712/29712 [00:00<00:00, 1290247.76it/s]
train_in: 29712
Building model...
Train on 22240 samples, validate on 2472 samples
Epoch 1/10
2018-01-08 09:02:17.175695: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2018-01-08 09:02:17.546411: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.6705
pciBusID: 0000:02:00.0
totalMemory: 11.00GiB freeMemory: 9.10GiB
2018-01-08 09:02:17.546508: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0, compute capability: 6.1)
22240/22240 [==============================] - 2s 103us/step - loss: 9.0471 - acc: 0.4180 - val_loss: 7.7675 - val_acc: 0.4822
Epoch 2/10
22240/22240 [==============================] - 1s 31us/step - loss: 8.4197 - acc: 0.4653 - val_loss: 7.7409 - val_acc: 0.5121
Epoch 3/10
22240/22240 [==============================] - 1s 30us/step - loss: 7.9926 - acc: 0.4897 - val_loss: 7.6013 - val_acc: 0.5150
Epoch 4/10
22240/22240 [==============================] - 1s 31us/step - loss: 8.0825 - acc: 0.4867 - val_loss: 7.7277 - val_acc: 0.5125
Epoch 5/10
22240/22240 [==============================] - 1s 31us/step - loss: 7.8882 - acc: 0.5019 - val_loss: 7.3081 - val_acc: 0.5336
Epoch 6/10
22240/22240 [==============================] - 1s 31us/step - loss: 7.7212 - acc: 0.5107 - val_loss: 7.2023 - val_acc: 0.5461
Epoch 7/10
22240/22240 [==============================] - 1s 31us/step - loss: 7.6613 - acc: 0.5156 - val_loss: 7.2052 - val_acc: 0.5445
Epoch 8/10
22240/22240 [==============================] - 1s 31us/step - loss: 7.5114 - acc: 0.5261 - val_loss: 7.2727 - val_acc: 0.5396
Epoch 9/10
22240/22240 [==============================] - 1s 31us/step - loss: 7.5069 - acc: 0.5255 - val_loss: 6.9958 - val_acc: 0.5514
Epoch 10/10
22240/22240 [==============================] - 1s 31us/step - loss: 7.2216 - acc: 0.5409 - val_loss: 6.5766 - val_acc: 0.5793
5000/5000 [==============================] - 0s 13us/step
Model saved to: model_checkpoint_128_batch_10_epochs.h5
Test score: 6.880058638
Test accuracy: 0.5618
As you can see, I did this with only ~22K training samples (about 30K balanced samples in total), which really isn't enough. We'd like to have more like 100K+, so let's play some more games and come back. That said, we can see we ended with ~54% accuracy in training and the test accuracy was 56%, so we're relatively confident we didn't over-fit.
Since our data is perfectly balanced across three choices, a "random" choice would yield 33% accuracy, so 56% accuracy is actually something, but we can probably do better in time. Let's say this is enough for now, though. How do we go about using our new AI and competing with it? That's what we'll be covering in the next tutorial.