Doing Math with Neural Networks testing addition results - Unconventional Neural Networks in Python and Tensorflow p.11




Hello and welcome to part 11 of the Unconventional Neural Networks series. Here, we're going to run through the results from the neural network that does addition, and then we'll go over making this even more advanced.

In the previous tutorial, we checked out some of the results, but let's go ahead and show a bit more. As the model trained, I just manually appended the step number to the end of the file name. So, for output_dev on step 15,000, I would just call that output_dev15k. From here, I created a script that would automatically check the output_dev output, comparing the real answer to whatever the model answered. Here's a basic version of that:

with open("data/tst2012.from", "r") as f:
    model_in = f.read().split('\n')

with open("model/output_dev15k", "r") as f:
    model_out = f.read().split('\n')

with open("data/tst2012.to", "r") as f:
    actual_out = f.read().split('\n')


correct = 0
total = 0
for i, _ in enumerate(model_out[:-1]):
    print("Input: {}. Desired Out: {} || Model out: {}".format(model_in[i].replace(" ", ""),
                                                               actual_out[i].replace(" ", ""),
                                                               model_out[i].replace(" ", "")))

    if actual_out[i] == model_out[i]:
        correct += 1
        print("YAAAAAAAAAYY!!!!")
    total += 1

print(correct/total)

At 15K steps, the model is already 1.2% accurate, which isn't all that bad, considering the type of model that this is, and the complexity of the challenge (at least for a neural network). Regardless, I continued training overnight until the model was complete, saving the outputs along the way as best I could, but the last ~150K steps occurred while I slept, so I didn't get all of those.

As time went on though, I wanted another metric besides loss and total accuracy to see how training was going. With something like math, the final accuracy is certainly a decent metric, but, another decent metric is "how wrong" something is, kind of like how loss works, only a bit more applicable to this exact problem. I wondered about tracking the absolute difference between the right answer and the predicted answers, to see if that metric also got better over time. Even if a model never actually got even 1 answer right, we'd still likely agree that it was at least learning if it was getting closer and closer to the right answers.

So then I made some modifications to the code:

with open("data/tst2012.from","r") as f:
    model_in = f.read().split('\n')

with open("model/output_dev25k","r") as f:
    model_out = f.read().split('\n')

with open("data/tst2012.to","r") as f:
    actual_out = f.read().split('\n')



correct = 0
total = 0
total_difference = 0
for i, _ in enumerate(model_out[:-1]):
    print("Input: {}. Desired Out: {} || Model out: {}".format(model_in[i].replace(" ",""),
                                                               actual_out[i].replace(" ",""),
                                                               model_out[i].replace(" ","")))

    total_difference += abs(int(actual_out[i].replace(" ","")) - int(model_out[i].replace(" ","")))

    if actual_out[i] == model_out[i]:
        correct+=1
        print("YAAAAAAAAAYY!!!!")
    total +=1

print(correct/total)
print(total_difference)

This time it just prints the total difference too.
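One thing to watch with the difference metric: int() raises a ValueError on anything that is not a number, so a single odd line in an output_dev file would stop the script. Below is a minimal sketch of the same scoring with that case handled. The helper names to_int and score are made up for illustration, the "<unk>" line is a hypothetical example of non-numeric output, and the space-separated digits are assumed from the .replace(" ", "") calls above. Note that it compares parsed integers rather than raw strings, which is a small departure from the script above.

```python
def to_int(text):
    """Return the integer in a space-separated token string, or None if it is not a number."""
    try:
        return int(text.replace(" ", ""))
    except ValueError:
        return None


def score(actual_lines, model_lines):
    """Return (accuracy, sum of absolute differences, number of unparsable model lines)."""
    correct = total = total_difference = unparsable = 0
    for actual, predicted in zip(actual_lines, model_lines):
        target = to_int(actual)
        if target is None:
            # e.g. the empty string left over after the final newline
            continue
        total += 1
        guess = to_int(predicted)
        if guess is None:
            # hypothetical case: the model produced something that is not a number
            unparsable += 1
            continue
        total_difference += abs(target - guess)
        if guess == target:
            correct += 1
    accuracy = correct / total if total else 0.0
    return accuracy, total_difference, unparsable


# Tiny in-memory stand-ins for the .to file and an output_dev file
actual_out = ["1 2 3", "4 5", "9", ""]
model_out = ["1 2 3", "5 0", "<unk>", ""]
print(score(actual_out, model_out))
```

Unparsable lines are counted as wrong but left out of the difference sum, so it is worth reporting that count next to the other two numbers.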

At 15K steps, the total accuracy was 1.2% and the total difference was 46,761. At 25K steps, the accuracy was actually 0.4%, so less than at 15K, but the total difference had decreased to 21,513. Even though the accuracy went down, the overall "closeness" to the right answers improved.

By 50k steps, accuracy was 9.8%, and the total difference was only 2044. Big improvements!

Finally, I came up with a 3rd version that tracked these changes over time:

with open("data/tst2012.from","r") as f:
    model_in = f.read().split('\n')

with open("data/tst2012.to","r") as f:
    actual_out = f.read().split('\n')


versions = [5, 10, 15, 20, 25, 30, 33, 39, 44, 50, 55, 62, 85, 90, 98, 103, 257]


for v in versions:

    with open("model/output_dev{}k".format(v),"r") as f:
        model_out = f.read().split('\n')

    correct = 0
    total = 0
    total_difference = 0
    for i, _ in enumerate(model_out[:-1]):

        total_difference += abs(int(actual_out[i].replace(" ","")) - int(model_out[i].replace(" ","")))

        if actual_out[i] == model_out[i]:
            correct+=1
        total +=1

    print("{}K Correct rate: {}. Sum of differences: {}".format(v, correct/total, total_difference))

The versions list is just all of the step counts (in thousands) that I had saved. For each one, we print out the accuracy and the total difference:

5K Correct rate: 0.0. Sum of differences: 394562
10K Correct rate: 0.002. Sum of differences: 47611
15K Correct rate: 0.012. Sum of differences: 46761
20K Correct rate: 0.002. Sum of differences: 18832
25K Correct rate: 0.004. Sum of differences: 21513
30K Correct rate: 0.026. Sum of differences: 8705
33K Correct rate: 0.036. Sum of differences: 6113
39K Correct rate: 0.038. Sum of differences: 13964
44K Correct rate: 0.068. Sum of differences: 12757
50K Correct rate: 0.098. Sum of differences: 2044
55K Correct rate: 0.088. Sum of differences: 1799
62K Correct rate: 0.168. Sum of differences: 2169
85K Correct rate: 0.434. Sum of differences: 434
90K Correct rate: 0.476. Sum of differences: 405
98K Correct rate: 0.524. Sum of differences: 356
103K Correct rate: 0.526. Sum of differences: 354
257K Correct rate: 1.0. Sum of differences: 0
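The sum of differences is easier to read per example. Here is a rough sketch that converts a few of the rows above into a count of correct answers, an average difference per test line, and an average miss among the wrong answers only. It assumes the test set has 500 lines, which the 0.002 steps in the correct rate suggest. The summarize helper is a made-up name for illustration.

```python
TEST_SIZE = 500  # assumption: number of lines in the test file


def summarize(rate, diff_sum, test_size=TEST_SIZE):
    """Return (number correct, mean difference per line, mean miss among wrong answers)."""
    n_correct = round(rate * test_size)
    n_wrong = test_size - n_correct
    mean_diff = diff_sum / test_size
    mean_miss = diff_sum / n_wrong if n_wrong else 0.0
    return n_correct, mean_diff, mean_miss


# (steps in thousands, correct rate, sum of differences), copied from the log above
rows = [(5, 0.0, 394562), (15, 0.012, 46761), (50, 0.098, 2044), (103, 0.526, 354), (257, 1.0, 0)]

for step_k, rate, diff_sum in rows:
    n_correct, mean_diff, mean_miss = summarize(rate, diff_sum)
    print("{}K: {} correct, mean difference {:.3f}, mean miss when wrong {:.3f}".format(
        step_k, n_correct, mean_diff, mean_miss))
```

Under that assumption, the 103K checkpoint gets 263 of 500 right, and the answers it misses are off by roughly 1.5 on average, so even the "wrong" answers are nearly right by then.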

By the end of training, this model learned to do addition with 100% accuracy. If you watch the video, you'll see I was somewhat surprised by the inference not having perfect accuracy. The inference in this case was custom-made for a chatbot, with after-the-fact scoring mechanisms in place. One of the main things is that the scoring prefers longer responses, which is actually why the smaller numbers seemingly weren't doing as well. A custom inference script could be made for the true, raw output, using the same code as what generates the output_dev files, for example. Anyway, pretty cool! We learned addition!

Want the model? I've uploaded it here: NMT-Addition Model 257K steps, which also contains the settings.

So if we can do addition, what about all math symbols? So division, subtraction, addition, and multiplication, all at once?

Here's a data generation script for doing all 4 types of operations:

import random

hm_samples = 10000000
max_val = 100000

operators = ['add','sub','mul','div']

def generate_pair(action):
    x = random.randrange(1, max_val)
    y = random.randrange(1, max_val)
    if action == 'add':
        result = x+y
        symbol = "+"
    elif action == 'sub':
        result = x-y
        symbol = "-"
    elif action == 'mul':
        result = x*y
        symbol = "*"
    elif action == 'div':
        result = round(x/y,7)
        symbol = "/"

    str_in = "{}{}{}\n".format(x, symbol, y)
    str_out = "{}\n".format(result)

    return str_in, str_out


def test_gen_pair(method='sub'):
    str_in, str_out = generate_pair(method)
    print(str_in)
    print(str_out)


if __name__ == "__main__":
    #test_gen_pair()
    with open("train.from", "a") as fin:
        with open("train.to", "a") as fout:
            for i in range(hm_samples):
                str_in, str_out = generate_pair(random.choice(operators))
                fin.write(str_in)
                fout.write(str_out)

    with open("tst2012.from", "a") as fin1:
        with open("tst2013.from", "a") as fin2:
            with open("tst2012.to", "a") as fout1:
                with open("tst2013.to", "a") as fout2:
                    for i in range(500):
                        str_in, str_out = generate_pair(random.choice(operators))
                        fin1.write(str_in)
                        fin2.write(str_in)
                        fout1.write(str_out)
                        fout2.write(str_out)

The above will produce equations like:

78049-1609
82342-60624
83188*70507
4988+18198
21562/25607
24494/2506
2305-7721
45157*60121
50226+31208
62895+94793
94956+18861
59858+53243
70692+20065
8614/47356
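To see what the target side looks like for each operator, here is a small sketch of a deterministic variant of generate_pair. The format_pair name is made up for illustration. It uses the same formatting as the script above, but the caller supplies x and y instead of drawing them at random.

```python
def format_pair(x, y, action):
    """Same formatting as generate_pair above, but with x and y supplied by the caller."""
    if action == 'add':
        result, symbol = x + y, "+"
    elif action == 'sub':
        result, symbol = x - y, "-"
    elif action == 'mul':
        result, symbol = x * y, "*"
    elif action == 'div':
        result, symbol = round(x / y, 7), "/"
    else:
        raise ValueError("unknown action: {}".format(action))
    return "{}{}{}\n".format(x, symbol, y), "{}\n".format(result)


examples = [(4988, 18198, 'add'), (2305, 7721, 'sub'), (10, 5, 'div'), (1, 99999, 'div')]
for x, y, action in examples:
    str_in, str_out = format_pair(x, y, action)
    print(str_in.strip(), "->", str_out.strip())
```

A few things stand out. Subtraction can go negative (2305-7721 gives -5416). Division always produces a float, even when it divides evenly (10/5 gives 2.0). Very small quotients are written by Python in scientific notation (1/99999 comes out as 1e-05). So the output vocabulary needs "-", "." and "e" in addition to the digits.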

I am curious about these results, but I am *also* curious about even MORE complex math. Since the model was able to learn addition, I am confident that the multi-operator model will probably work decently well. I still want to run and test the above, but...what about making waaaaaaaay more complex operations? Here's the complex-math code:

import random
from collections import defaultdict


hm_test = 500
hm_samples = 10000000+hm_test

max_val = 100000
max_number_of_nums = 10
operators = ["+", "-", "*", "/"]

equations = {}

while len(equations) < hm_samples:
    nums = [random.randrange(1,max_val) for _ in range(random.randrange(2,max_number_of_nums))]

    number_of_parenthesis = random.randrange(0, min(4, len(nums)-2)) if len(nums) > 2 else 0
    opening_parenthesis = defaultdict(lambda: 0)
    closing_parenthesis = defaultdict(lambda: 0)
    for _ in range(number_of_parenthesis):
        opening_parenthesis_position = random.randrange(0, len(nums)-1)
        if opening_parenthesis[opening_parenthesis_position] > 0 and opening_parenthesis[opening_parenthesis_position] + 1 in closing_parenthesis.values():
            continue
        opening_parenthesis[opening_parenthesis_position] += 1
        closing_parenthesis_position = random.randrange(opening_parenthesis_position + 1, len(nums))
        if closing_parenthesis[closing_parenthesis_position] > 0 and closing_parenthesis[closing_parenthesis_position] + 1 in opening_parenthesis.values():
            opening_parenthesis[opening_parenthesis_position] -= 1
            continue
        closing_parenthesis[closing_parenthesis_position] += 1

    init_str = ''

    while opening_parenthesis[0] > 0 and closing_parenthesis[len(nums)-1] > 0:
        opening_parenthesis[0] -= 1
        closing_parenthesis[len(nums)-1] -= 1

    for index, num in enumerate(nums):

        while opening_parenthesis[index] > 0 and closing_parenthesis[index] > 0:
            opening_parenthesis[index] -= 1
            closing_parenthesis[index] -= 1

        operator = random.choice(operators) if init_str != '' else ''
        init_str += "{}{}{}{}".format(operator, '('*opening_parenthesis[index], str(num), ')'*closing_parenthesis[index])

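    try:
        # eval computes the answer; the string is built entirely by this script
        equations[init_str] = eval(init_str)
    except ZeroDivisionError:
        # skip expressions where a parenthesized group works out to zero in a denominator
        pass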

#print('\n'.join([k + ' = ' + str(v) for k, v in equations.items()]))
with open("train.from", "a") as fin:
    with open("train.to", "a") as fout:
        for k, v in list(equations.items())[:-hm_test]:
            fin.write(k)
            fin.write('\n')
            fout.write(str(v))
            fout.write('\n')


with open("tst2012.from", "a") as fin1:
    with open("tst2013.from", "a") as fin2:
        with open("tst2012.to", "a") as fout1:
            with open("tst2013.to", "a") as fout2:

                for k, v in list(equations.items())[-hm_test:]:
                    fin1.write(k)
                    fin1.write('\n')
                    fin2.write(k)
                    fin2.write('\n')
                    fout1.write(str(v))
                    fout1.write('\n')
                    fout2.write(str(v))
                    fout2.write('\n')


This makes equations like:

63166/21707+25193-26327+14443*20117*67066/91296
22564/(15291+65142*83720*6457+91001/70325+9577)
77861+((43454-88314*(78299/77643)+40734)/61134/46151)
90584+26054+54674
91680/(49369+(99777-91774-(1089-58896/99825*83470)/42034))
23831*58422+51593+55339
51065+96120*50507
82385*54087/45899
52283-(37808*86291+25851)*62242
58635*72485
80418/87375*(71408-38976)
52734*35731*80873-5370+89899/64551
15100-9067/51953/49726
34087+89287
25126*90947*43776
52241/78092-54404
84155*24269+61062
34993-29484/13714/98436
39590*33244+(48665/11603*45145*44756)-(17328+35983)

Solving equations like that, or even getting close, is a pretty cool challenge. Let's see how we do!
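A side note on the eval call in the generator: it is fine there because every string is built by the script itself. If you would rather not use eval at all, for example when scoring expressions that come from somewhere else, here is a minimal sketch of an arithmetic-only evaluator built on the standard ast module. The safe_eval name is made up for illustration, and it assumes Python 3.8 or newer, where number literals parse as ast.Constant.

```python
import ast
import operator

# Only the four operators used by the generator are allowed
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def safe_eval(expr):
    """Evaluate an expression made of numbers, + - * / and parentheses; reject anything else."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        raise ValueError("unsupported expression: {!r}".format(expr))

    return _eval(ast.parse(expr, mode="eval"))


for equation in ["34087+89287", "58635*72485", "80418/87375*(71408-38976)"]:
    # For these inputs the result should match the built-in eval
    print(equation, "=", safe_eval(equation), "| same as eval:", safe_eval(equation) == eval(equation))
```

Like eval, this still raises ZeroDivisionError when a denominator works out to zero, so the generator would need the same try/except around it.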

The next tutorial: Complex Math - Unconventional Neural Networks in Python and Tensorflow p.12