The Neural Network Toolbox application demo
concerns designing a neural network with 25
(non-optimal) hidden nodes to recognize 26 7-by-5
binary images of the capital letters A-Z. The letters
are obtained from the function PRPROB ("P"attern
"R"ecognition "PROB"lem) by using the command
[alphabet26 targets] = prprob;
Details can be obtained in the usual way:
edit prprob % WARNING change the name before editing!
The demo can be run either by using the website code
or by calling the function APPCR1.
edit appcr1 % WARNING change the name before editing!
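To get a quick look at the data before running the demo,
the bitmaps can be loaded and displayed directly. A
minimal sketch (the exact row/column order of the 35
pixels is my assumption; adjust the RESHAPE if the
letter appears transposed):

```matlab
[alphabet, targets] = prprob;      % alphabet: 35-by-26, targets: 26-by-26
A = reshape(alphabet(:,1), 5, 7)'; % column 1 is the letter A (7-by-5 grid)
imagesc(A)                         % display the binary bitmap
colormap(gray), axis equal tight
```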
There is a final remark about increasing the number of
hidden nodes for the recognition of very noisy images.
Recently, the CSSM thread
"Character Recognition using neural networks"
concerns designing a neural network with 36 (non-
optimal) hidden nodes to recognize 36 7-by-5 binary
images of the 10 digits 0-9 and the 26 capital letters
A-Z. The letters differ from those obtained
from the PRPROB function and, except for the letter A,
appear to be more recognizable by sight.
The reason for the current thread is to emphasize that
although the optimal number of hidden nodes can be
obtained by arduous brute force trial and error, it
is worthwhile to first consider two simpler models.
THE NAIVE CONSTANT MODEL
The Naive Constant Model outputs the average of the
training target examples (independent of input) and,
consequently, obtains a mean-square-error (MSE00)
that is approximately equal to the mean of the 36 target
variances. The importance of MSE00 is that it sets
a scale for the performance of more complicated models
that can be judged by using either the normalized MSE
or the corresponding R-square (R^2) statistic.
NMSE = MSE/MSE00
R2 = 1-NMSE
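For one-hot {0,1} targets like those above, MSE00 can
be checked directly. A minimal sketch for the
36-character case (variable names are my own):

```matlab
N = 36;
T = eye(N);                  % one-hot targets: one 1, 35 zeros per column
% Naive constant model: output the row-wise target mean
% regardless of input. Its MSE is set by the target variances.
MSE00 = mean(var(T,0,2))     % mean of the 36 target variances = 1/36
% Scale for judging any candidate model with mean-square-error MSE:
% NMSE = MSE/MSE00;  R2 = 1 - NMSE;
```

Here VAR uses the unbiased (N-1) normalization, which
yields 1/36 = 0.027778, matching the MSE00 quoted below;
the biased (N) normalization gives (N-1)/N times that,
which is why MSE00 is only approximately the plain MSE of
the constant model.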
THE LINEAR MODEL
The Linear Model is usually obtained by QR via the
BACKSLASH operator or REGRESS function. However,
in ill-conditioned cases pseudoinversion via the
PINV function can be used. The model results in the
normalized mean-square-error MSE0 and corresponding
R-square statistic R20. The importance of MSE0 is
that the linear model is a special case of a neural
net with no hidden nodes. Therefore, it can serve
as the 1st candidate in the search for the optimal
number of hidden nodes.
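A minimal sketch of the no-hidden-node linear model via
BACKSLASH, using the PRPROB letters (for the thread's
36-character set, substitute its 35-by-36 input matrix;
variable names are my own):

```matlab
[X, T] = prprob;                 % X: 35-by-26 inputs, T: 26-by-26 targets
N  = size(X,2);
Xa = [ones(1,N); X];             % augment inputs with a bias row
W  = (Xa' \ T')';                % least-squares weights via QR
                                 % (use pinv(Xa') in ill-conditioned cases)
Y  = W*Xa;                       % linear model outputs
E  = T - Y;
MSE0  = mean(E(:).^2);
MSE00 = mean(var(T,0,2));        % naive constant model reference
NMSE0 = MSE0/MSE00;
R20   = 1 - NMSE0;
% Classify by the largest output in each column:
[~, cls]  = max(Y);
[~, tcls] = max(T);
PctErr0 = 100*mean(cls ~= tcls);
```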
For the 36-character example in the recent thread above:
MSE00 = 0.027778
PctErr00 = 97.222 % (35/36 misclassifications)
MSE0 = 0.0007716
NMSE0 = 0.027777
R20 = 0.97222
PctErr0 = 0 % Perfect Recognition!
When training images are noiseless but test images
are contaminated with Gaussian noise, the correct
recognition rate vs signal-to-noise ratio (dB) and the
corresponding outputs for each character are shown in
the accompanying plots [figures not reproduced here].
Including noisy images in the training set
should improve performance. However, it is
not clear to me at what point hidden nodes
are necessary for improvement.
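The noisy-test experiment can be sketched as follows,
again for the PRPROB letters and the linear model (the
SNR grid and the signal-power convention are my own
assumptions):

```matlab
[X, T] = prprob;                     % noiseless training bitmaps
Xa = [ones(1,size(X,2)); X];
W  = (Xa' \ T')';                    % linear model fit via BACKSLASH
snrdB = 20:-5:0;                     % test signal-to-noise ratios (dB)
rate  = zeros(size(snrdB));
Ps    = mean(X(:).^2);               % average signal power of the bitmaps
for k = 1:numel(snrdB)
    sigma = sqrt(Ps*10^(-snrdB(k)/10));  % noise std for this SNR
    Xn = X + sigma*randn(size(X));       % Gaussian-contaminated test set
    Y  = W*[ones(1,size(Xn,2)); Xn];
    [~, c] = max(Y);                     % predicted classes
    [~, t] = max(T);                     % true classes
    rate(k) = 100*mean(c == t);          % correct recognition rate (%)
end
disp([snrdB(:) rate(:)])             % rate (%) vs SNR (dB)
```

Averaging over many noise realizations per SNR would
give a smoother curve than this single-draw sketch.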
Hope this helps.
9/2/2010 12:07:45 PM