Basic questions about Matlab/NNT

Hi!

I am a newbie when it comes to artificial neural networks, so please 
keep the answers simple ;-)

I need to implement digit classification (OCR) with numbers 0..9 as 
output and approx. 3800 training images. I also have ~1800 test images.

I found the OCR example shipped with Matlab and played with it. It uses 
a feedforward NN with backpropagation (newff).
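
For reference, a minimal sketch of such a setup (assuming the old 
newff(PR, ...) interface of this NNT version; P and T are hypothetical 
64 x N input and 10 x N 0/1 target matrices):

H = 15;                                  % hidden units, to be tuned
net = newff(minmax(P), [H 10], {'tansig' 'logsig'}, 'trainscg');
net.trainParam.epochs = 500;             % training iterations
net.trainParam.goal = 0.8;               % performance goal (see below)
[net, tr] = train(net, P, T);            % opens the performance chart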

Here are my questions:

How can I determine the optimal number of training images? If I use all 
3800 it takes some time to train. I played with subsets of 200, 400, 
500, 800 images and the results did not improve linearly. Should I 
"re-train" the net with the same training set a few times?

The train function shows its progress in a chart with the epochs on the 
X axis and the performance on the Y axis. What does this performance 
value mean, and how does it relate to the overall "detection rate" of 
the net when fed with test data?

I played with a performance goal between 1.0 and 0.5 and found 0.8 to be 
the best for my data-set. Is there a way to calculate the optimum? How 
is it related to the other parameters?

The number of hidden units ;-) I tested 10, 11, 12, 13 ... 20 hidden 
units, and depending on the other parameters, 13-16 were optimal. Does 
this sound reasonable?

Even with (in my eyes) optimal settings I only get a success rate of 
95%. I test all 1800 test images against the net and at the end compute: 
number_of_correct_results / total_test_images * 100 = XX%
I know it depends on the image data, but are there any general reasons - 
common mistakes or otherwise - for this "low" success rate?

And here is my final question ;-) If I try a NN without hidden units (I 
already tried newp, which gave me an 83% success rate) - which one would 
you recommend for digit classification?

I hope the questions are not too stupid :-)

TIA,
-- 
----------------------------------------------------------------
,yours Thomas Zangl - thomas@tzis.net - http://www.tzis.net/ -
- Freelancer - IT Consulting & Software Development -
Use Y.A.M.C! now! Get it at http://www.borg-kindberg.ac.at/yamc/
Thomas
4/29/2006 7:18:34 AM
comp.ai.neural-nets

Thomas Zangl wrote:
> Hi!
>
> I am a newbie when it comes to artificial neural networks, so please
> keep the answers simple ;-)
>
> I need to implement digit classification (OCR) with numbers 0..9 as
> output and approx. 3800 training images. I also have ~1800 test images.
>
> I found the OCR example shipped with Matlab and played with it. It uses
> a feedforward NN with backpropagation (newff).
>
> Here my questions:
>
> How can I determine the optimal number of training images? If I use all
> 3800 it takes some time to train. I played with subsets of 200, 400,
> 500, 800 images and the results did not improve linearly. Should I
> "re-train" the net with the same training set a few times?

Inputs: What and how many?
Outputs: What and how many?
Hidden nodes: How many and why?

> The train function shows its progress in a chart with the epochs on the
> X axis and the performance on the Y axis. What does this performance
> value mean

The mean squared error (MSE) of the output nodes.

> and how
> does it relate to the overall "detection rate" of the net when fed
> with test data?
>

There is no analytic relationship between the classification error 
rate, a discontinuous quantity, and the MSE (continuous) of the 
class-conditional posterior probabilities (achieved with 0/1 targets). 
Cross entropy is another continuous objective function that can be used 
for classification. However, it is not supported by MATLAB.
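
You can, however, compute the two side by side and see how loosely 
they track each other. A rough sketch (Ptst/Ttst are assumed 64 x Ntst 
test inputs and 10 x Ntst 0/1 test targets):

Y = sim(net, Ptst);                      % continuous outputs
msetst = mean(mean((Ttst - Y).^2));      % what the training chart plots
[dum, c] = max(Y);                       % winning output per column
[dum, t] = max(Ttst);                    % true class per column
rate = 100 * sum(c == t) / size(Ttst, 2) % discrete "detection rate" in %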

> I played with a performance goal between 1.0 and 0.5 and found 0.8 to be
> the best for my data-set. Is there a way to calculate the optimum? How
> is it related to the other parameters?

My rule of thumb is MSE/variance of targets < 0.01.
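
In code, that rule might look like this (a sketch; T is the assumed 
10 x Ntrn matrix of 0/1 targets):

vart = mean(var(T', 1));                 % average variance of the targets
net.performFcn = 'mse';
net.trainParam.goal = 0.01 * vart;       % enforce MSE/variance < 0.01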

> The number of hidden units ;-) I tested 10, 11, 12, 13 ... 20 hidden
> units, and depending on the other parameters, 13-16 were optimal.
> Does this sound reasonable?

Search archives using

greg-heath Neq Nw H

> Even with (in my eyes) optimal settings I only get a success rate of
> 95%. I test all 1800 test images against the net and at the end compute:
> number_of_correct_results / total_test_images * 100 = XX%
> I know it depends on the image data, but are there any general reasons -
> common mistakes or otherwise - for this "low" success rate?

Typically, 95% sounds good.

> And here is my final question ;-) If I try a NN without hidden units (I
> already tried newp, which gave me an 83% success rate) - which one would
> you recommend for digit classification?

newff without hidden nodes is the only one left. Don't expect it to be
any better.

Hope this helps.

Greg
>
> I hope the questions are not too stupid :-)
>
> TIA,
> --
> ----------------------------------------------------------------
> ,yours Thomas Zangl - thomas@tzis.net - http://www.tzis.net/ -
> - Freelancer - IT Consulting & Software Development -
> Use Y.A.M.C! now! Get it at http://www.borg-kindberg.ac.at/yamc/

Greg
4/30/2006 8:36:11 AM
Greg Heath wrote:

Hi!

> Inputs: What and how many?

The image is represented as 64 integer values encoding 16 colors. After 
transformation they can be displayed as an image with w=8 and h=8.

So, inputs = 64
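
For illustration, the stacking into the 64 x N input matrix the toolbox 
expects might look like this (a sketch; imgs is an assumed 8x8xN array):

N = size(imgs, 3);
P = zeros(64, N);
for k = 1:N
    P(:, k) = double(reshape(imgs(:, :, k), 64, 1)); % one column per image
end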

> Outputs: What and how many?

Output is the classification, so the number range is 0..9 (= 10 outputs)
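
For illustration, the matching 1-of-10 targets can be built with the 
toolbox helpers (a sketch; labels is an assumed 1 x N vector of digits 
0..9):

T = full(ind2vec(labels + 1));           % 10 x N matrix of 0/1 columns
% and back to a digit after simulation (winner-take-all):
digits = vec2ind(compet(sim(net, P))) - 1;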

> Hidden nodes: How many and why?

As stated below we tried 1 .. 20 hidden units.

>>I played with a performance goal between 1.0 and 0.5 and found 0.8 to be
>>the best for my data-set. Is there a way to calculate the optimum? How
>>is it related to the other parameters?
 >
> My rule of thumb is MSE/variance of targets < 0.01.

We have been using SSE as the performance function...

> Search archives using
> 
> greg-heath Neq Nw H

I get ~10 hits; I assume you mean this part:
---- cite ---
For an I-H-O MLP, the number of weight/bias unknowns is

Nw = (I+1)*H+(H+1)*O = O+(I+O+1)*H

and the number of training equations is

Neq = Ntrn*O.
---- cite ---

> Typically, 95% sounds good.

Are there ways to improve this? Tweaks/tricks/pre-processing of the test data?

> newff without hidden nodes is the only one left. Don't expect it to be
> any better.

Ok. I also tried 2 hidden layers but the results did not improve. They 
even got a bit worse (~93%).

TIA,
-- 
----------------------------------------------------------------
,yours Thomas Zangl - thomas@tzis.net - http://www.tzis.net/ -
- Freelancer - IT Consulting & Software Development -
Use Y.A.M.C! now! Get it at http://www.borg-kindberg.ac.at/yamc/
Thomas
4/30/2006 9:27:43 AM
Thomas Zangl wrote:
> Greg Heath wrote:
>
> Hi!
>
> > Inputs: What and how many?
>
> The image is represented as 64 integer values encoding 16 colors. After
> transformation they can be displayed as an image with w=8 and h=8.
>
> So, inputs = 64
>
> > Outputs: What and how many?
>
> Output is the classification, so the number range is 0..9 (= 10 outputs)
>
> > Hidden nodes: How many and why?
>
> As stated below we tried 1 .. 20 hidden units.

Nw = 10+(64+10+1)*20 = 1,510
Neq = 9*Ntrn         % Only 9 outputs are independent

Neq >~ 10*Nw ==> Ntrn >~ 15,100/9 ~ 1,700

Since you are not starved for training data, use 1700 or more.
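
The same bookkeeping in MATLAB, using the formulas quoted below:

I = 64; O = 10; H = 20;
Nw = O + (I + O + 1) * H                 % 1510 weight/bias unknowns
Ntrn = ceil(10 * Nw / (O - 1))           % ~1678, so round up to ~1700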

> >>I played with a performance goal between 1.0 and 0.5 and found 0.8 to be
> >>the best for my data-set. Is there a way to calculate the optimum? How
> >>is it related to the other parameters?
>  >
> > My rule of thumb is MSE/variance of targets < 0.01.
>
> We have been using SSE as the performance function...

MSE = SSE/(Ntrn-Nw)

>
> > Search archives using
> >
> > greg-heath Neq Nw H
>
> I get ~10 hits; I assume you mean this part:
> ---- cite ---
> For an I-H-O MLP, the number of weight/bias unknowns is
>
> Nw = (I+1)*H+(H+1)*O = O+(I+O+1)*H
>
> and the number of training equations is
>
> Neq = Ntrn*O.

Actually, O should be replaced by the number of independent outputs: 
probably (O-1) in your case.

> ---- cite ---
>
> > Typically, 95% sounds good.
>
> Are there ways to improve this? Tweaks/tricks/pre-processing of the test data?

Search for ensembles and committees.
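
As a rough sketch, a small committee can be as simple as training 
several nets from different random initializations and averaging their 
outputs (P, T, Ptst as assumed before; all names hypothetical):

K = 5;                                   % committee size
Ysum = zeros(10, size(Ptst, 2));
for k = 1:K
    net = newff(minmax(P), [15 10], {'tansig' 'logsig'}, 'trainscg');
    net = train(net, P, T);              % fresh random weights each time
    Ysum = Ysum + sim(net, Ptst);        % accumulate continuous outputs
end
[dum, c] = max(Ysum / K);                % averaged vote picks the class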

> > newff without hidden nodes is the only one left. Don't expect it to be
> > any better.
>
> Ok. I also tried 2 hidden layers but the results did not improve. They
> even got a bit worse (~93%).

Hope this helps.

Greg

Greg
5/1/2006 4:54:45 PM