Doing Math with Neural Networks - Unconventional Neural Networks in Python and Tensorflow p.10




What's going on everyone and welcome to part 10 of our "unconventional" neural networks series. We are going to switch gears now to what I think is the most interesting type of model at the moment: the sequence to sequence model. The idea here is that you can map any sequence to any other sequence, and we can use this for all sorts of things. For example, my most popular example of this has been the Chatbot tutorial series, along with its implementation, Charles the AI. That chatbot is a bit of a hack on its own, but I am curious about mapping other kinds of sequences to each other. While neural networks do appear to understand things similarly to how we do, they are still very different in some of the things they learn and in some of the ways they appear to process data. One task that has historically been challenging for neural networks is math. While math is something computers naturally understand and excel at, it presents a decent challenge to a neural network running on one.

Neural networks tend to do well at non-linear relationships, while mathematics is very linear. That doesn't mean we can't find ways to present it non-linearly in an attempt to "see" math in different ways.

Why might we have any interest in embarking on this journey, though? Machines can already do math just fine. Well, machines can do linear math well, but they tend to have more trouble with more complex types of math, and there are many mathematical equations not yet solved. Also, one area where math is used that machines are traditionally pretty poor at solving is encryption. Without the keys, machines mainly use brute force to break encryption. While unlikely, it's still possible that neural networks could instead be used both to generate new forms of encryption and to break current forms.

We're probably not going to be breaking even rudimentary encryption here, but we can at least play around. To do this, I am going to use my and Daniel's chatbot code. Just in case things have changed over time, I am going to be working specifically from the GitHub project commit c503d2c. If things are acting differently, it might be because this package has changed, and you might want to download this specific commit.

Grab that with git (recursively!) or download the zip and extract it, making sure to grab the nmt part too! Make sure you have files inside the main package's nmt/nmt/ directory; if not, go get them! This code is mainly made for building a chatbot. That said, just like we used a generative model to do things it wasn't necessarily meant for, we can do the same here. The chatbot code is meant to tokenize by word or subword, but it can also run as a character-level sequence to sequence model, which is what I am going to shoot for.
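With git, something like the following should get you the right code, including the nmt submodule. Note that I'm assuming the repository lives at the URL below; substitute the actual project URL if yours differs:

git clone --recursive https://github.com/daniel-kukiela/nmt-chatbot.git
cd nmt-chatbot
git checkout c503d2c
git submodule update --init --recursive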

When you have the package, do a pip install -r requirements.txt to get the dependencies, and then we're ready to go.

With this library, we map an input to an output. The inputs go into a train.from file, separated by newlines, and the corresponding outputs go into train.to. So, for example, we could have:

1+2
2+4
5+22

The above goes into train.from, while the matching train.to file is:

3
6
27

For now, let's start with addition and see if our model can learn it. To do this, we need to generate some training data, so I am going to use the following script:

import random

hm_samples = 300000  # how many training pairs to generate
max_val = 100000     # operands are drawn from [1, max_val)


def generate_pair(action):
    # Build one (input, output) pair, e.g. "12+34\n" -> "46\n"
    x = random.randrange(1, max_val)
    y = random.randrange(1, max_val)
    if action == 'add':
        result = x+y
        symbol = "+"
    elif action == 'sub':
        result = x-y
        symbol = "-"
    elif action == 'mul':
        result = x*y
        symbol = "*"
    elif action == 'div':
        result = x/y  # true division yields a float, so '.' would also need to be in the vocab
        symbol = "/"
    else:
        raise ValueError("unknown action: {}".format(action))

    str_in = "{}{}{}\n".format(x, symbol, y)
    str_out = "{}\n".format(result)

    return str_in, str_out


def test_gen_pair(method='add'):
    str_in, str_out = generate_pair(method)
    print(str_in)
    print(str_out)


if __name__ == "__main__":
    #test_gen_pair()
    # training data ("a" appends, so delete any old files before re-running)
    with open("train.from", "a") as fin, open("train.to", "a") as fout:
        for i in range(hm_samples):
            str_in, str_out = generate_pair("add")
            fin.write(str_in)
            fout.write(str_out)

    # evaluation data (the nmt code expects tst2012 and tst2013 sets)
    with open("tst2012.from", "a") as fin1, \
         open("tst2013.from", "a") as fin2, \
         open("tst2012.to", "a") as fout1, \
         open("tst2013.to", "a") as fout2:
        for i in range(100):
            str_in, str_out = generate_pair("add")
            fin1.write(str_in)
            fin2.write(str_in)
            fout1.write(str_out)
            fout2.write(str_out)

Nothing too special here: it simply generates sample pairs, converts the inputs and outputs to strings, and writes them out, over and over!
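If you'd like to sanity-check the generated files, a quick optional snippet like this will confirm the lines pair up (assuming you only generated addition samples, as above):

with open("train.from") as fin, open("train.to") as fout:
    # check the first 5 pairs: parse the question, verify the answer
    for _, (question, answer) in zip(range(5), zip(fin, fout)):
        x, y = question.strip().split("+")
        assert int(x) + int(y) == int(answer.strip())
        print(question.strip(), "=", answer.strip())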

Next, after running this, we should get 6 new .from and .to files, which we will then move into the new_data directory, replacing what was there.
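On Linux or macOS, something like this should do it (run from the package root; the file names are the ones our script just wrote, and I'm assuming new_data sits in the package root as it does in this commit):

mv train.from train.to tst2012.from tst2012.to tst2013.from tst2013.to new_data/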

Next, let's go into setup/settings.py and set vocab_size to 11. This covers all of the individual digits (0-9) along with the single operator we're using right now (+). If we add more operators, remember to account for them in the vocab size, or it won't work out too well! We'll also set 'use_bpe': False and 'embedded_detokenizer': False, since we want a character-level model. Then, let's change the epochs line to:

    'epochs': [0.001, 0.001, 0.0005, 0.0005, 0.00025, 0.00025, 0.0001, 0.0001, 0.00001, 0.00001],

Finally, I am going to make this model a 6x128 (6 layers of 128 units each).
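For reference, here's roughly what the relevant entries in setup/settings.py end up looking like. This is just a sketch, and the exact key names may vary slightly between commits of the project:

preprocessing = {
    'vocab_size': 11,              # digits 0-9 plus the '+' operator
    'use_bpe': False,              # no byte-pair encoding; we want raw characters
    'embedded_detokenizer': False,
    'epochs': [0.001, 0.001, 0.0005, 0.0005, 0.00025, 0.00025,
               0.0001, 0.0001, 0.00001, 0.00001],
}

hparams = {
    'num_layers': 6,               # the "6" in 6x128
    'num_units': 128,              # the "128" in 6x128
}

Once you've got your settings where you want them: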

python setup/prepare_data.py

Don't run this from within the setup directory; make sure to run it from the package's root directory. Once that's done, we can run:

python train.py

...and then go do something else for a while as it trains!

Once that's done, we can run inference with python inference.py. For a quick test, I actually only trained for 7k steps, since I was curious to see the results. It looks like the model has learned some math: it currently gets about half of the answers right, and the other half are at least close.

Running a few examples:

> 5212+1224
- 6436 [7.1]
- 6336 [6.6]
- 5436 [6.6]
- 6426 [6.1]
- 5336 [6.1]
- 6536 [6.1]
- 7436 [6.1]
- 7336 [5.6]
- 6326 [5.6]
- 6346 [5.6]
- 6446 [5.6]
- 5426 [5.6]
- 5536 [5.6]
- 5446 [5.6]
- 5326 [5.6]
- 5346 [5.6]
- 6526 [5.6]
- 7426 [5.6]
- 7346 [5.6]
- 7326 [5.6]
> 500+55211
- 55711 [7.25]
- 56711 [6.75]
- 55611 [6.75]
- 54711 [6.25]
- 55721 [6.25]
- 45711 [6.25]
- 55811 [6.25]
- 56611 [5.75]
- 55712 [5.75]
- 46711 [5.75]
- 56721 [5.75]
- 65711 [5.75]
- 56811 [5.75]
- 54611 [5.75]
- 55621 [5.75]
- 57711 [5.75]
- 45611 [5.75]
- 56712 [5.75]
- 54721 [5.75]
- 54811 [5.75]
> 300+4621
- 4821 [7.1]
- 4811 [6.6]
- 5821 [6.6]
- 5811 [6.1]
- 5021 [6.1]
- 5011 [6.1]
- 5921 [6.1]
- 4911 [5.6]
- 4921 [5.6]
- 5721 [5.6]
- 5911 [5.6]
- 5711 [5.6]
- 4721 [5.6]
- 4711 [5.6]
- 5621 [5.6]
- 5611 [5.6]
- 5831 [5.6]
- 4831 [5.6]
- 5031 [5.6]
- 6021 [5.6]

The top answer is the one the model chose. It actually got the first two examples right; on the last one it was incorrect (300+4621 is 4921, not 4821), but the right answer is at least IN there. With more training, I am confident that we could iron this out and "solve" addition. Let's let this run, and see the results with a fully trained model.

The next tutorial: Doing Math with Neural Networks testing addition results - Unconventional Neural Networks in Python and Tensorflow p.11