Thresholding




Now we're ready to discuss "Thresholding."

The idea of thresholding is to simplify the image. Some people like the visual effect as well, but we're interested in the simplifying aspect. When we try to identify characters, shapes, or other objects, the sheer number of possible colors gets in the way. To analyze anything complex, we need to break it down into its most basic parts. With thresholding, we can look at an image, work out its "average" color, and use that average as the threshold: pixels lighter than the average become white, and darker pixels become black. Some thresholding won't go all the way to full black or white and leaves some gradient, but for our basic purposes, we want to go all the way!
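To make the idea concrete, here is a rough sketch of average-based thresholding on a tiny hand-made array. The helper name threshold_sketch, the choice of "brightness" as the mean of the red, green, and blue values, and the assumption that the input is RGBA are all illustrative choices, not necessarily how the function built later in this series works.

```python
import numpy as np


def threshold_sketch(pixels):
    """Hypothetical helper: push every RGBA pixel to pure black or pure white."""
    pixels = np.asarray(pixels, dtype=np.float64)
    # Brightness of each pixel: the mean of its R, G, B values (alpha ignored).
    brightness = pixels[..., :3].mean(axis=-1)
    # The image-wide average brightness acts as the threshold.
    cutoff = brightness.mean()
    # Start from opaque black, then paint the brighter-than-average pixels white.
    out = np.zeros(pixels.shape, dtype=np.uint8)
    out[..., 3] = 255
    out[brightness > cutoff, :3] = 255
    return out


# A made-up one-row image: two yellow, four blue, two yellow pixels.
yellow = [255, 242, 0, 255]
blue = [63, 72, 204, 255]
row = np.array([[yellow, yellow, blue, blue, blue, blue, yellow, yellow]])

result = threshold_sketch(row)
print(result[0])  # yellow pixels come out white, blue pixels come out black
```

Pixels that land exactly on the average are treated as black here; that tie-breaking rule is an arbitrary choice for the sketch.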

Let's see an example of an image that we want to threshold.

In your script, instead of dot.png, let's open a number example: "i = Image.open('images/numbers/y0.4.png')"

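from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

# Open the number example instead of dot.png
i = Image.open('images/numbers/y0.4.png')

# Convert the image to a NumPy array of pixel values
iar = np.asarray(i)

# Print the raw pixel data and display the image
plt.imshow(iar)
print(iar)
plt.show()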

We get the following output image:


A single row of data now looks like:

 [[255 242   0 255]
  [255 242   0 255]
  [ 63  72 204 255]
  [ 63  72 204 255]
  [ 63  72 204 255]
  [ 63  72 204 255]
  [255 242   0 255]
  [255 242   0 255]]
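Each inner list in that printout is one pixel in [red, green, blue, alpha] order. As a small illustration of how to pull a single row or pixel out of such an array, the sketch below uses a made-up 2x3 array as a stand-in for the real iar, so it runs without the image file.

```python
import numpy as np

# Stand-in for np.asarray(Image.open(...)): a made-up 2x3 RGBA image.
iar = np.array([
    [[255, 242, 0, 255], [63, 72, 204, 255], [255, 242, 0, 255]],
    [[63, 72, 204, 255], [63, 72, 204, 255], [255, 242, 0, 255]],
], dtype=np.uint8)

print(iar.shape)   # (rows, columns, channels) -> (2, 3, 4)

first_row = iar[0]     # every pixel in the top row
pixel = iar[0, 1]      # the second pixel of that row: [R, G, B, A]
red, green, blue, alpha = pixel
print(first_row)
print(red, green, blue, alpha)
```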

Now let's look at another version of this zero, using "i = Image.open('images/numbers/0.4.png')"

This is the same zero, just in different colors. The resulting image is:


A sample row of pixels from this example is:

[[255 255 255 255]
  [  0   0   0 255]
  [  0   0   0 255]
  [  0   0   0 255]
  [  0   0   0 255]
  [  0   0   0 255]
  [  0   0   0 255]
  [255 255 255 255]]

Even by eye, the row of pixels in the second example is much easier to make out and recognize, since it contains only 0s and 255s. The same holds for data analysis. We want to simplify everything to 0 or 255 if we can, and thresholding lets us do that.
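One way to see why this helps: once every pixel is pure black or pure white, a row collapses to a single bit per pixel. The sketch below does this for the black-and-white sample row shown earlier; the names bw_row, ink, and bits are just illustrative.

```python
import numpy as np

# The black-and-white sample row: white, six black pixels, white.
white = [255, 255, 255, 255]
black = [0, 0, 0, 255]
bw_row = np.array([white] + [black] * 6 + [white], dtype=np.uint8)

# True where a pixel is black ("ink"), False where it is white.
ink = (bw_row[:, :3] == 0).all(axis=1)
bits = ink.astype(int)
print(bits)  # [0 1 1 1 1 1 1 0]
```

Four numbers per pixel have become a single 0 or 1, which is far easier to compare between images.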

We can even do this with images that are intentionally made hard to read. For example, use "i = Image.open('images/numbers/y0.5.png')"


Some people make the mistake of adding captcha-like challenges such as this to their websites, not realizing that they only fool other humans, not the computers and programs that are made to read characters.

Right now, we don't have anything to solve this challenge for us, but it's not too hard to solve at a basic level, so that's what we'll be doing next.
