This is my first whack at using PyBrain for optical character recognition. I am limiting myself to numerical data, since that's what I have lying around needing to be optically recognized the most. I'm also focusing on extra-small, heavily corrupted data.
## Data Preprocessing
Since my PNG images are all different sizes, I decided to vignette them all onto the largest format, which was 10×9 pixels. Here, `ds` is a dict that has filenames of PNG images as keys, and the class label as a value. This dict was created using the image labeling GUI detailed in my previous post. The list `raw` is a list of two-element lists. The first element is a (three-dimensional) NumPy array, and the second element is the class label.
```python
raw = list()
for k, v in ds.iteritems():
    im = cv2.imread( k )
    im = im[:,:,0]
    raw.append( [ im, v ] )
```
Here, `sz` is a list of image dimensions. The values `mh` and `mw` are the maximum height and width image dimensions found in the training data.
```python
sz = np.array( [ i[0].shape for i in raw ] )
mh, mw = np.max( sz[:,0] ), np.max( sz[:,1] )
```
This loop vignettes each image onto a standard size, `mh` by `mw`, and then normalizes the pixel data from 0-255 to 0-1. The data is put into the list `normed`. Each row of `normed` is a list of values between 0 and 1 from the image data, and then a class label.
```python
normed = list()
for i in range( len( raw ) ):
    img, lbl = raw[i]
    z = np.ones( ( mh, mw ) )*255.0
    h, w = img.shape
    dh = ( mh - h ) / 2
    dw = ( mw - w ) / 2
    z[dh:dh+h,dw:dw+w] = img
    z /= 255.0
    z = z.ravel()
    normed.append( list( z ) + [ int( lbl ) ] )
```
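To sanity-check the centering arithmetic, here is a tiny stand-alone example: a hypothetical 2×3 "image" pasted onto a 4×5 white canvas, mirroring the loop above (I use `//` so the integer division behaves the same under Python 2 and 3):

```python
import numpy as np

## hypothetical 2x3 "image" padded onto a 4x5 white (255) canvas
img = np.array( [ [  0.0, 10.0, 20.0 ],
                  [ 30.0, 40.0, 50.0 ] ] )
mh, mw = 4, 5

z = np.ones( ( mh, mw ) )*255.0
h, w = img.shape
dh = ( mh - h ) // 2   ## vertical offset that centers the image
dw = ( mw - w ) // 2   ## horizontal offset
z[dh:dh+h,dw:dw+w] = img
z /= 255.0             ## normalize 0..255 -> 0..1

assert z.shape == ( 4, 5 )
assert z[0,0] == 1.0   ## untouched border stays white
assert z[1,1] == 0.0   ## top-left pixel of the pasted image
```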
## Using PyBrain
Here is the code for using PyBrain. It looks like all of the useful functions are peppered through a bunch of modules. I’m not very familiar with the tool yet, and this is cobbled together from the documentation. First we initialize and populate the data set with the training data. The training data in this example is all of the data except for the last 20 items.
```python
from pybrain.tools.shortcuts import buildNetwork
from pybrain.datasets import SupervisedDataSet
from pybrain.supervised.trainers import BackpropTrainer
import time

## initialize a data set
ds = SupervisedDataSet( len( normed[0] ) - 1, 1 )

## populate the data set
N = len( normed )
for i in range( N-20 ):
    ds.addSample( normed[i][:-1], normed[i][-1] )
```
Next we set up a neural network and train it. The `90, 60, 30, ...` bit gives the number of neurons per layer of the neural network. The first layer has 90 neurons, the next layer has 60 neurons, etc. I settled on the output layer having one neuron since I'm looking for a single classification, but maybe that's not the best idea? I'm still working on it.
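For comparison, the usual alternative is one output neuron per class, with one-hot target vectors. A minimal sketch of that encoding (plain NumPy, not PyBrain-specific, with 10 classes assumed for digits):

```python
import numpy as np

def one_hot( lbl, n_classes=10 ):
    ## target vector with a 1 at the class index, 0 elsewhere
    t = np.zeros( n_classes )
    t[ lbl ] = 1.0
    return t

## e.g. the digit 3 becomes [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
t = one_hot( 3 )
assert t[3] == 1.0 and t.sum() == 1.0

## a network's 10-value output is decoded back to a label with argmax
p = np.array( [ 0.1, 0.05, 0.2, 0.9, 0.0, 0.1, 0.0, 0.3, 0.05, 0.1 ] )
assert np.argmax( p ) == 3
```

With PyBrain this would mean building the network with a 10-neuron output layer and decoding predictions with `argmax` instead of rounding.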
```python
## set up the neural network
net = buildNetwork( 90, 60, 30, 20, 10, 1, bias=True )
trainer = BackpropTrainer( net, ds )

## initial time
t0 = time.time()

## training
for i in range( 500 ):
    trainer.train()

## final time
t1 = time.time()
```
This part collects the output. The `err` term is used for determining convergence. I ran this bad boy all night and it never converged, so I settled instead on training it over 500 epochs. The accuracy is the percentage of correct classifications of the last 20 items, which the network has not been trained on, based on the class labels.
```python
## error returned by the trainer
err = trainer.train()

## determine the accuracy
score = 0
for i in range( N-20, N ):
    p = net.activate( normed[i][:-1] )
    res = np.round( p )
    if res == normed[i][-1]:
        score += 1
acc = score / float( 20 )

## print the time (in minutes), the error, and the accuracy
print '{:.2f} {:.5e} {:.2f}'.format( (t1-t0)/60.0, err, acc )
```
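For reference, the convergence test I had in mind looks roughly like this: stop when the error stops changing by more than some tolerance between epochs. A sketch with a fake, shrinking error sequence standing in for `trainer.train()` (the tolerance and sequence are made up for illustration):

```python
## fake error sequence standing in for successive trainer.train() calls
errors = [ 0.5, 0.3, 0.2, 0.15, 0.1495 ]

tol = 1e-3
prev = float( 'inf' )
epochs = 0
for err in errors:
    epochs += 1
    if abs( prev - err ) < tol:
        break   ## converged: successive errors differ by less than tol
    prev = err

assert epochs == 5
```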
## Conclusion
I ended up with 65-70% accuracy, which I think is pretty good based on the dismal quality of my data, and the fact that I did not do any feature extraction. This was essentially a proof of principle exercise to myself, as I did not know if it would work at all on my data. In the future I’d like to look at feature extraction techniques, and further neural network options and topologies.