Now that we have our sample data, we're ready to compare against it. The method that we're going to use is very simple, yet you'll be surprised how well it works.
We will simply go pixel by pixel, checking whether each pixel in the image in question is the same as the corresponding pixel in each example. Boom, done.
from PIL import Image
import numpy as np
import time
from collections import Counter

def whatNumIsThis(filePath):
    matchedAr = []
    # each line of numArEx.txt looks like  number::pixel-array-as-string
    loadExamps = open('numArEx.txt','r').read()
    loadExamps = loadExamps.split('\n')

    # convert the image in question to the same string form as the examples
    i = Image.open(filePath)
    iar = np.array(i)
    iarl = iar.tolist()
    inQuestion = str(iarl)

    for eachExample in loadExamps:
        try:
            splitEx = eachExample.split('::')
            currentNum = splitEx[0]
            currentAr = splitEx[1]

            # splitting on '],' gives one chunk per pixel
            eachPixEx = currentAr.split('],')
            eachPixInQ = inQuestion.split('],')

            x = 0
            while x < len(eachPixEx):
                # every matching pixel is one vote for this example's number
                if eachPixEx[x] == eachPixInQ[x]:
                    matchedAr.append(int(currentNum))
                x+=1
        except Exception as e:
            print(str(e))

    print(matchedAr)
    # tally the votes per number
    x = Counter(matchedAr)
    print(x)
    print(x[0])

whatNumIsThis('images/test.png')
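If the string splitting makes the idea harder to see, here is a minimal sketch of the same pixel-by-pixel check done directly on NumPy arrays. This is not the code from the video: the function name count_matching_pixels and the two tiny 2x2 "images" are made up for illustration, and it assumes both images have the same shape.

```python
import numpy as np

def count_matching_pixels(a, b):
    """Count the pixels where every channel of a and b is identical."""
    a = np.asarray(a)
    b = np.asarray(b)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    # a == b compares channel by channel; np.all(..., axis=-1) collapses the
    # channel axis, so a pixel only counts when all of its channels agree
    return int(np.all(a == b, axis=-1).sum())

# Hypothetical 2x2 RGB images, just to show the counting
white = [255, 255, 255]
black = [0, 0, 0]
example = np.array([[white, black], [black, white]])
question = np.array([[white, black], [white, white]])

print(count_matching_pixels(example, question))  # 3 of the 4 pixels match
```

The count plays the same role as the number of times a digit gets appended to matchedAr for one example in the function above.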
First we're running against the test image, which is my hand-drawn 2 and is not included in the training set. Running the code, we get our Counter of matches:
Counter({2: 461, 3: 389, 6: 374, 9: 366, 5: 365, 7: 364, 8: 361, 1: 325, 4: 321})
Thus, the prediction is a 2, as it has the highest match count. So that was a success. You may point out that the next closest match, a 3, scored 389. To this I would point out that the least-matched option scored 321, so you can see that the scale is not really something like 0-500. It is really probably best thought of as a range from 400-500 or so. Anything under 400 is too loose a match to be considered confident.
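To make that reading of the numbers concrete, here is a small sketch that pulls the prediction out of the Counter shown above. The counts are the ones from the test run; the variable names and the 400 cutoff as a hard threshold are assumptions for illustration, not part of the original function.

```python
from collections import Counter

# Match counts from the test run above
matches = Counter({2: 461, 3: 389, 6: 374, 9: 366, 5: 365,
                   7: 364, 8: 361, 1: 325, 4: 321})

# most_common(2) returns the two highest (number, count) pairs
(best, best_count), (runner_up, runner_count) = matches.most_common(2)
lowest_count = min(matches.values())

# Assumed cutoff, taken from the rough "under 400 is too loose" rule of thumb
CONFIDENT_AT = 400

print(best, best_count)               # 2 461
print(best_count - runner_count)      # 72, gap between the 2 and the 3
print(runner_count - lowest_count)    # 68, spread across all the wrong answers
print(best_count >= CONFIDENT_AT)     # True
```

Notice the gap between first and second place (72) is larger than the whole spread between the second-place number and the worst match (68), which is why the 2 stands out even though 389 looks close to 461 at first glance.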
If you're confused about the function, check out the video, where it is explained line by line.
Next, we can also test this with some other hand-drawn numbers. In the video, I draw a number 3 for example, and get another success. In the next video, we'll take it a step further to visualize the results, and do some more tests.