MLP Binary classification issue


Hello,

I'm trying to develop a NN that will take linear combinations of the
RGB values of an image and determine whether or not the image is an
eye.

For my training data I have divided each object into 4 sections and
then computed 3 values for each section, giving me 12 features for my
input. I then have a list of which images are eyes; I compare it
against my feature vectors and set the target to 0.9 if the image is an
eye and 0.1 if it isn't.

My training data is probably 75% negatives and 25% positives which I
think may cause a problem.

The structure of my NN is 12 inputs, 18 neurons in a single hidden
layer and 1 output. I am using the logistic activation function for
all my neurons.

I've tried to model this neural network both within Matlab and C++ and
get incorrect results for both.

Within Matlab, using the netlab toolbox when I use the following code

net = mlp(12, 18, 1, 'logistic', 0.01)
[net, options] = netopt(net, options, tdata, ttarget, 'scg');

I get this output
Cycle    1  Error 5870.086642  Scale 1.000000e+000
....
Cycle   50  Error 5522.258553  Scale 2.384186e-007
....
Cycle  100  Error 5468.619200  Scale 9.313226e-010

In C++ I believe there are errors in my implementation, but seeing
that the Matlab model isn't working either, I'm reluctant to spend more
time refining the C++ model.

Thanks

Reply Chris 10/24/2007 4:07:12 PM

Chris <chris.olekas@gmail.com> wrote in news:1193242032.117404.39090
@y27g2000pre.googlegroups.com:

> My training data is probably 75% negatives and 25% positives which I
> think may cause a problem.
>
> The structure of my NN is 12 inputs, 18 neurons in a single hidden
> layer and 1 output. I am using the logistic activation function for
> all my neurons.


There might be a problem with your data, i.e., the features are not descriptive enough. There is also a 
chance your NN model is too large for your training dataset. Also, take a look at the thread regarding 
training with unbalanced data. If the data are ok, the NN should "move" towards a solution anyway.


-- 
Harris
Reply Harris 10/24/2007 11:38:45 PM


Chris wrote:
> Hello,
> 
> I'm trying to develop a NN that will take a linear combination of RGB
> values of an image and determine whether or not  the image is an eye
> or not.
> 
> For my training data I have divided each object into 4 sections and
> then compute 3 values for each sections giving me a 12 features for my
> input. I then have a list of which images are eyes and compare them to
> what my feature vector is and set my target to 0.9 if the image was an
> eye and 0.1 if the image isn't.

Three values? R, G, B?

And you average over four regions --- hence in effect four pixels?

It would be interesting to see the raw images and then the feature data.

Where did you get the images from? What I'm getting at is: did you
capture the images yourself under controlled lighting conditions?

The first thing I'd try is to convert the images to monochrome and
somehow normalise them to the range [0, 1] (assuming float / double);
what I'm getting at here is a sort of radiometric calibration,
something very difficult to do for colour data. Then extract the
average-over-regions features. Maybe go for 3 x 3 (9 features) or
4 x 4 (16 features). It would be nice to have a look at those 3 x 3 or
4 x 4 images.
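
A rough sketch of that suggestion in Python (stdlib only). The luminance
weights (0.299, 0.587, 0.114) are one common choice and an assumption
here, and dividing by 255 is plain scaling for 8-bit channels, not a
real radiometric calibration. The function names are made up for
illustration.

```python
def to_gray(rgb_image):
    """Convert rows of (r, g, b) pixels with 0-255 channels to luminance in [0, 1]."""
    return [[(0.299 * r + 0.587 * g + 0.114 * b) / 255.0 for (r, g, b) in row]
            for row in rgb_image]


def region_means(gray, n):
    """Average a grayscale image over an n x n grid of regions (row-major order).

    Assumes the image is at least n pixels high and n pixels wide,
    so that no region is empty.
    """
    height, width = len(gray), len(gray[0])
    features = []
    for i in range(n):
        r0, r1 = i * height // n, (i + 1) * height // n
        for j in range(n):
            c0, c1 = j * width // n, (j + 1) * width // n
            block = [gray[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            features.append(sum(block) / len(block))
    return features
```

Calling region_means(to_gray(image), 3) gives the 9 features, and
region_means(to_gray(image), 4) the 16 features.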

It would also be nice to do a bit of visualisation of the feature space;
something simple like PCA and see what the first two components reveal.
I know PCA is linear and an MLP with a hidden layer is non-linear, but ...
still nice to have a look.

Another bit of exploration would be to cluster the data into five or ten
classes, and look at the agreement between the true class label and the
cluster label.
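
One way to look at that agreement, sketched in Python. It assumes the
cluster assignments have already been produced by some clustering
routine (not shown); cluster_label_table and cluster_purity are
hypothetical helper names, and purity is just one simple summary.

```python
from collections import Counter


def cluster_label_table(cluster_ids, labels):
    """Count how many samples fall into each (cluster, true label) pair."""
    return Counter(zip(cluster_ids, labels))


def cluster_purity(cluster_ids, labels):
    """Fraction of samples whose label is the majority label of their cluster."""
    table = cluster_label_table(cluster_ids, labels)
    best = {}
    for (cluster, _label), count in table.items():
        best[cluster] = max(best.get(cluster, 0), count)
    return sum(best.values()) / len(labels)
```

If eyes and non-eyes mostly land in different clusters, the purity is
close to 1; if every cluster is an even mix, the features probably do
not separate the classes well.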
> 
> My training data is probably 75% negatives and 25% positives which I
> think may cause a problem.
> 
> The structure of my NN is 12 inputs, 18 neurons in a single hidden
> layer and 1 output. I am using the logistic activation function for
> all my neurons.
> 
> I've tried to model this neural network both within Matlab and C++ and
> get incorrect results for both.


> 
> Within Matlab, using the netlab toolbox when I use the following code
> 
> net = mlp(12, 18, 1, 'logistic', 0.01)
> [net, options] = netopt(net, options, tdata, ttarget, 'scg');
> 
> I get this output
> Cycle    1  Error 5870.086642  Scale 1.000000e+000
> ...
> Cycle   50  Error 5522.258553  Scale 2.384186e-007
> ...
> Cycle  100  Error 5468.619200  Scale 9.313226e-010
> 

I don't know what those numbers mean.

> And in C++ I have errors I believe with my implementation but seeing
> how the matlab model isn't working I'm cautious to spend more time
> refining the C++ model.
> 

Yes, don't even consider coding an MLP in C++ until you get some hope 
from your Matlab experiments.

And then, don't try to code it, unless you have to.

Best regards,

Jon C.




Reply Jonathan 10/25/2007 5:43:42 PM

On Oct 24, 12:07 pm, Chris <chris.ole...@gmail.com> wrote:
> Within Matlab, using the netlab toolbox when I use the following code
>
> net = mlp(12, 18, 1, 'logistic', 0.01)
> [net, options] = netopt(net, options, tdata, ttarget, 'scg');
>
> I get this output
> Cycle    1  Error 5870.086642  Scale 1.000000e+000
> ...
> Cycle   50  Error 5522.258553  Scale 2.384186e-007
> ...
> Cycle  100  Error 5468.619200  Scale 9.313226e-010

I don't use netlab so I don't know what "error" and "scale" mean.

If you would explain, maybe we could help you more.

Hope this helps.

Greg


Reply Greg 10/25/2007 7:47:02 PM

On 25 Oct, 20:47, Greg Heath <he...@alumni.brown.edu> wrote:
> On Oct 24, 12:07 pm, Chris <chris.ole...@gmail.com> wrote:
>
> > I get this output
> > Cycle    1  Error 5870.086642  Scale 1.000000e+000
> > ...
> > Cycle   50  Error 5522.258553  Scale 2.384186e-007
> > ...
> > Cycle  100  Error 5468.619200  Scale 9.313226e-010
>
> I don't use netlab so I don't know what "error" and "scale" mean.
>
> If you would explain, maybe we could help you more.

Your netopt call uses scaled conjugate gradients ('scg'), so IIRC the
'scale' is an indication of "trust": a value of 1 means "gradient
descent", a value close to zero is a one-dimensional Newton step, and a
small value is an indication that all is well.  The error is not the
misclassification rate but the cross-entropy, i.e. the negative
log-likelihood of the data.  The messages from netopt look fine
(although the model appears not to have converged yet; keep training
until the numbers stop changing).

What accuracy do you get on the test set with the NETLAB model?

The model has a constant regularisation of 0.01 on all of the weights;
this *may* be too strong depending on the nature of the dataset.  You
will need to set the regularisation parameter via e.g.
cross-validation, or use the evidence framework (see the demos).

Lastly, it is a bad idea to use targets of 0.1 and 0.9, as you are
basically telling the network "I am 90% sure this is not an eye" and
"I am 90% sure that this is an eye" respectively, where in fact you
are probably 100% certain one way or the other.  The 0.1 and 0.9 is an
old heuristic to prevent the output saturating, but NETLAB is a good
bit of kit and is written well enough for this not to be a real
problem.
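
To illustrate the point about targets, here is a small Python sketch
(not NETLAB code; the function names are made up). It evaluates the
per-pattern cross-entropy for a single logistic output and searches a
grid for the output value that minimises it. With a target of 0.9 the
best the network can do is output 0.9, so the soft target caps its
confidence; with a target of 1 the error keeps falling as the output
approaches 1.

```python
import math


def cross_entropy(y, t):
    """Cross-entropy (negative log-likelihood) of one logistic output y for target t."""
    return -(t * math.log(y) + (1.0 - t) * math.log(1.0 - y))


def best_output(t, steps=999):
    """Grid-search the output in (0, 1) that minimises the error for target t."""
    grid = [(k + 1) / (steps + 1) for k in range(steps)]
    return min(grid, key=lambda y: cross_entropy(y, t))
```

For example, best_output(0.9) sits at 0.9, while best_output(1.0) is
the largest value on the grid.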

Reply Gavin 10/25/2007 8:23:11 PM

> I'm trying to develop a NN that will take a linear combination of RGB
> values of an image and determine whether or not  the image is an eye
> or not.
>
> For my training data I have divided each object into 4 sections and
> then compute 3 values for each sections giving me a 12 features for my
> input. I then have a list of which images are eyes and compare them to
> what my feature vector is and set my target to 0.9 if the image was an
> eye and 0.1 if the image isn't.
>
> My training data is probably 75% negatives and 25% positives which I
> think may cause a problem.
>
> The structure of my NN is 12 inputs, 18 neurons in a single hidden
> layer and 1 output. I am using the logistic activation function for
> all my neurons.

Would it be possible for you to post your data so that we could have a look at
it?

-- 
Phil Sherrod
(PhilSherrod 'at' comcast.net)
http://www.dtreg.com  (Decision trees, Neural networks and SVM modeling)
http://www.nlreg.com  (Nonlinear Regression)
Reply Phil 10/26/2007 2:17:54 AM

> Three values? R, G, B?

The three values are in fact combinations of the RGB values. To be
exact they're the average red value in a region minus the average
green value, the average red value minus the average blue value, and
the absolute difference between the average blue value and the average
green value for a region. I also use another feature that is a
combination of the average R G B values to calculate the redness of
the region.

> And you average over four regions --- hence in effect four pixels?

I divide a region into 4 quadrants and calculate the above values for
each of the quadrants. This gives me 16 features (4 values x 4
quadrants).
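
For anyone trying to reproduce this, the feature computation as
described could look roughly like the Python sketch below. The three
difference features follow the description above; the redness formula
was not given, so example_redness is a hypothetical placeholder and
should be replaced by the real combination. The region is assumed to be
a list of rows of (r, g, b) tuples, at least 2 x 2 pixels.

```python
def mean_rgb(pixels):
    """Average red, green and blue over a list of (r, g, b) pixels."""
    n = len(pixels)
    return (sum(p[0] for p in pixels) / n,
            sum(p[1] for p in pixels) / n,
            sum(p[2] for p in pixels) / n)


def example_redness(r, g, b):
    # Hypothetical placeholder: the actual redness formula is not stated.
    return r - (g + b) / 2.0


def quadrant_features(pixels, redness):
    """R-G, R-B, |B-G| of the quadrant averages, plus a redness value."""
    r, g, b = mean_rgb(pixels)
    return [r - g, r - b, abs(b - g), redness(r, g, b)]


def region_features(region, redness=example_redness):
    """Split a region into 4 quadrants and return 4 values per quadrant (16 total)."""
    height, width = len(region), len(region[0])
    mid_y, mid_x = height // 2, width // 2
    features = []
    for rows in (range(0, mid_y), range(mid_y, height)):
        for cols in (range(0, mid_x), range(mid_x, width)):
            pixels = [region[y][x] for y in rows for x in cols]
            features.extend(quadrant_features(pixels, redness))
    return features
```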

> It would be interesting to see the raw images and then the feature data.
>
>  From where did you get the images; what I'm getting at is did you
> capture the images yourself under controlled lighting conditions?

The images are from many different sources. I do some preprocessing on
them to identify areas within the image that are candidate eyes;
however, I'm getting a lot of false positives from this step, hence the
attempt to use an ANN to complete the classification.

> The first thing I'd try is convert the images to monochrome and somehow
> normalise them to range [0, 1]

I've now done this for each feature vector. I originally normalised
each feature across my entire feature vector set; however, when I
wanted to apply the ANN it failed, because the values I used to
normalise each feature (the mean/standard deviation of that feature
over the entire set) aren't readily available when I try to classify a
single feature vector. Right now I'm trying to normalise across the
feature vector itself.
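
One common way around this is to compute the per-feature mean and
standard deviation once on the training set, store them alongside the
weights, and reuse them for every vector classified later. A minimal
Python sketch, with fit_scaler and apply_scaler as made-up names:

```python
import math


def fit_scaler(training_vectors):
    """Compute the per-feature mean and standard deviation over the training set."""
    n = len(training_vectors)
    dims = len(training_vectors[0])
    means = [sum(v[d] for v in training_vectors) / n for d in range(dims)]
    stds = []
    for d in range(dims):
        var = sum((v[d] - means[d]) ** 2 for v in training_vectors) / n
        # A constant feature has zero spread; use 1.0 to avoid dividing by zero.
        stds.append(math.sqrt(var) or 1.0)
    return means, stds


def apply_scaler(vector, means, stds):
    """Normalise a single feature vector with the stored training statistics."""
    return [(x - m) / s for x, m, s in zip(vector, means, stds)]
```

The means and stds would be saved to a file after training and loaded
by the application, so a single candidate region is scaled exactly the
way the training data was.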

> Another bit of exploration would be to cluster the data into five or ten
> classes, and look at the comparability of true class label and cluster
> class label.

A colleague suggested this but my data set is too large (around 30000
feature vectors) to come up with unique classifiers that would allow
me to cluster classes.


> Yes, don't even consider coding an MLP in C++ until you get some hope
> from your Matlab experiments.
>
> And then, don't try to code it, unless you have to.

A colleague of mine also developed a C++ MLP NN, but there was
problem-domain-specific code within the project. I've been hacking away
at it to try and get it to work with my own problem and I've had
somewhat successful results.

Right now it seems normalisation is the issue, as well as the features
I have selected, so I may have to rethink them.

Reply Chris 10/26/2007 12:54:32 PM

On Oct 25, 10:17 pm, "Phil Sherrod"
<PhilSher...@REMOVETHIScomcast.net> wrote:
> Would it be possible for you to post your data so that we could have a look at
> it?

That's one of the problems! I'm having trouble generating meaningful,
reproducible data due to normalisation issues

Reply Chris 10/26/2007 12:57:57 PM

On 26-Oct-2007, Chris <chris.olekas@gmail.com> wrote:

> > Would it be possible for you to post your data so that we could have a
> > look at it?
> >
> That's one of the problems! I'm having trouble generating meaningful,
> reproducible data due to normalisation issues

I think you're going to have to clear that up before you worry about the
details of creating a neural network to model your data.  Once you get
meaningful, reproducible data, set up a link where we can access it, and we'll
try to help you build a model.

-- 
Phil Sherrod
(PhilSherrod 'at' comcast.net)
http://www.dtreg.com  (Decision trees, Neural networks and SVM modeling)
http://www.nlreg.com  (Nonlinear Regression)
Reply Phil 10/26/2007 10:33:07 PM

On Oct 26, 5:33 pm, "Phil Sherrod" <PhilSher...@REMOVETHIScomcast.net>
wrote:
> I think you're going to have to clear that up before you worry about the
> details of creating a neural network to model your data.  Once you get
> meaningful, reproducible data, set up a link where we can access it, and we'll
> try to help you build a model.

Hi again,

I've had some success classifying objects using the C++ NN
package a colleague gave me. However, the results from training and
evaluation do not carry over when I apply the trained weights in my
application.

The full data set that I am training over can be found here
http://www.eng.uwaterloo.ca/~cvolekas/Ann-Data.txt

Basically it's a 48-feature vector, with the last column indicating the
target value of the row. I have 2244 rows of data, 1122 for target 0
and 1122 for target 1.

The C++ program uses two thresholds for training: 0.9 for 1 and 0.3
for 0. If the output is between 0.3 and 0.9, it is noted as a low
confidence value. While training, I get around a 94% classification
rate, with the other 6% of the outputs falling between 0.3 and 0.9.
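
In Python terms, the two-threshold rule described above amounts to
something like the sketch below. This is not the C++ package's code;
the names are made up, and whether the boundary values 0.3 and 0.9
themselves count as confident is an assumption.

```python
def decide(output, low=0.3, high=0.9):
    """Map a network output to 1, 0, or None (low confidence)."""
    if output >= high:
        return 1
    if output <= low:
        return 0
    return None


def tally(outputs, targets, low=0.3, high=0.9):
    """Count correct, misclassified and low-confidence outputs."""
    counts = {"correct": 0, "misclassified": 0, "low confidence": 0}
    for output, target in zip(outputs, targets):
        decision = decide(output, low, high)
        if decision is None:
            counts["low confidence"] += 1
        elif decision == target:
            counts["correct"] += 1
        else:
            counts["misclassified"] += 1
    return counts
```

Keeping "misclassified" separate from "low confidence" matters: an
output of 0.95 for a non-eye is a confident error, not a low-confidence
one.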

When I use the weights generated from training within my
application, however, I get poorer results. There are roughly ten times
more negatives than positives, and the NN classifies only around 60% of
the negatives as negatives and the rest as positives.

Any insight would be great.

Reply Chris 11/8/2007 8:01:24 PM

On Nov 8, 3:01 pm, Chris <chris.ole...@gmail.com> wrote:
> When I use the weightings generated from training however within my
> application I get poorer results. There are roughly ten times more
> negatives than positives and the NN can only classify around 60% of
> these as negatives and the rest positives.

Just another note.

I tested my data within Matlab and got 99% classification after 10000
epochs. However, when I took the weights obtained from the
Matlab netopt command and used them within my C++ program I still
got somewhat lackluster results.

Also, for the C++ NN program, I made some changes so that the binary
classification is counted properly (originally I messed up and never
counted outputs as misclassified, only as falling outside
my confidence thresholds). Now I'm getting results similar to this
(after 10k epochs):

Correctly classified: 1878
Misclassified: 210
Not within bounds of confidence (output node's value is between 0.3
and 0.9): 156

Reply Chris 11/9/2007 12:03:24 AM

On  8-Nov-2007, Chris <chris.olekas@gmail.com> wrote:

> I tested my data within Matlab and got 99% classification after 10000
> epochs.

Is that 99% accuracy on the training data?  Did you do any sort of validation
such as holding out a sample or doing v-fold cross-validation?

How many layers and how many nodes were in the neural network you built?

> Correctly classified:
> 1878
> Misclassified:
> 210
> Not within bounds of confidence (output node's value is between 0.9
> and 0.3)
> 156

For analyses with a continuous target variable such as you have, I'm more
accustomed to measuring accuracy as a proportion (or percent) of variance
explained by the model.  Did you get that figure out of the analysis?

I built a General Regression Neural Network for your data.  It explained 99.9%
of the variance of the training data, but it explained only 73% of the variance
on a 20% hold-out sample.
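
For reference, "proportion of variance explained" is usually computed
as one minus the ratio of the residual sum of squares to the total sum
of squares. The Python sketch below assumes that definition; it is not
taken from any particular package.

```python
def variance_explained(targets, predictions):
    """Return 1 - SS_residual / SS_total for paired targets and predictions."""
    mean_target = sum(targets) / len(targets)
    ss_total = sum((t - mean_target) ** 2 for t in targets)
    ss_residual = sum((t - p) ** 2 for t, p in zip(targets, predictions))
    return 1.0 - ss_residual / ss_total
```

Computing this separately on the training rows and on the held-out rows
is what exposes a gap like the one above.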

I'm building an SVM model right now.  I'll post the results later.

-- 
Phil Sherrod
(PhilSherrod 'at' comcast.net)
http://www.dtreg.com  (Decision trees, Neural networks and SVM modeling)
http://www.nlreg.com  (Nonlinear Regression)
Reply Phil 11/9/2007 3:01:15 AM

On Nov 8, 10:01 pm, "Phil Sherrod" <PhilSher...@REMOVETHIScomcast.net>
wrote:
> On  8-Nov-2007, Chris <chris.ole...@gmail.com> wrote:
>
> > I tested my data within Matlab and got 99% classification after 10000
> > epochs.
>
> Is that 99% accuracy on the training data?  Did you do any sort of validation
> such as holding out a sample or doing v-fold cross-validation?
>

I admit that I haven't done much validation by holding out a sample
or by v-fold cross-validation. I was worried that I didn't have enough
data for training, so I tried to avoid leaving any out. The application
that requires the neural net for classification is basically what I'm
using for validation, and this is where it's doing poorly. The
application is in its control phase as we validate how effective the
NN is.

> How many layers and how many nodes were in the neural network you built?
>
Currently I'm using a single layer with 37 hidden nodes.

Reply Chris 11/9/2007 2:50:15 PM

On Nov 9, 9:50 am, Chris <chris.ole...@gmail.com> wrote:
> I admit that I haven't done much validation with holding out a sample
> or v-fold cross-validation. I was worried that the amount of data I
> had wasn't enough for training so I tried to refrain from leaving out
> some data. The application that requires the neural net for
> classification is what I'm using for validation basically, and this is
> where its doing poorly. The application is in its control phase as we
> validate how effective the NN is.

I would also like to add that the data I'm validating on is not
identical to what I'm training on. There is a chance that a feature
vector I'm evaluating will be very close to one I trained on; however,
for the most part the feature vectors have only a little in common, and
from looking at the validation data almost none are identical.

Reply Chris 11/9/2007 3:11:26 PM

On 9 Nov, 15:11, Chris <chris.ole...@gmail.com> wrote:
> On Nov 9, 9:50 am, Chris <chris.ole...@gmail.com> wrote:
> > I admit that I haven't done much validation with holding out a sample
> > or v-fold cross-validation. I was worried that the amount of data I
> > had wasn't enough for training so I tried to refrain from leaving out
> > some data. The application that requires the neural net for
> > classification is what I'm using for validation basically, and this is
> > where its doing poorly. The application is in its control phase as we
> > validate how effective the NN is.

I have had a look at your training data using kernel logistic
regression (basically a sophisticated sort of RBF network) using my
Generalised Kernel Machine MATLAB toolbox (you can download it from
http://theoval.cmp.uea.ac.uk/~gcc/projects/gkm/).  It gives a 10-fold
cross-validation error of 8.82%.

In general, never pay attention to training set performance as an
indication of generalisation performance.  If you are worried about
the lack of data, perform 10-fold cross-validation to get a
performance estimate and then train the final model on the entire
dataset.
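
A bare-bones version of that procedure in Python, for illustration
only (this is not the toolbox's code). train_fn and error_fn are
hypothetical callables supplied by the caller: train_fn(data, targets)
returns a model, and error_fn(model, data, targets) returns an error
rate on the given rows.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(data, targets, train_fn, error_fn, k=10, seed=0):
    """Average the held-out error over k folds."""
    errors = []
    for held_out in k_fold_indices(len(data), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(data)) if i not in held]
        model = train_fn([data[i] for i in train_idx],
                         [targets[i] for i in train_idx])
        errors.append(error_fn(model,
                               [data[i] for i in held_out],
                               [targets[i] for i in held_out]))
    return sum(errors) / k
```

Every row is used for testing exactly once and for training k - 1
times, so no data is permanently set aside.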

BTW my toolbox has primitives for automated model selection and
cross-validation, so you can just leave it running overnight (as I did)
without really putting in any effort yourself.

If the relative class frequencies in the training data are
unrepresentative of operational conditions you can generally
compensate by changing the threshold.  This is easiest if you have a
classifier that outputs an estimate of probability, such as KLR.
There is a good paper on this by Lowe and Webb,

@article{Lowe1990,
   author       = "Lowe, D. and Webb, A. R.",
   title        = "Exploiting prior knowledge in network optimization: an
                   illustration from medical prognosis",
   journal      = "Network: Computation in Neural Systems",
   volume       = 1,
   number       = 3,
   pages        = "299--323",
   year         = 1990
}
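
As a sketch of the threshold adjustment in Python: assuming the
classifier output is an estimate of the posterior probability under the
training class frequencies, and that misclassification costs are equal,
Bayes' rule gives the output threshold that corresponds to 0.5 under
the operational frequencies. The function name is made up.

```python
def adjusted_threshold(train_prior, operational_prior):
    """Output threshold equivalent to 0.5 on the prior-corrected posterior.

    Both priors are the probability of the positive class.
    """
    a = train_prior * (1.0 - operational_prior)
    b = operational_prior * (1.0 - train_prior)
    return a / (a + b)
```

For a balanced training set (prior 0.5) used where negatives outnumber
positives ten to one (prior 1/11), the threshold moves from 0.5 up to
10/11, i.e. about 0.91.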

Reply Gavin 11/10/2007 10:35:52 AM

On Nov 10, 5:35 am, Gavin Cawley <GavinCaw...@googlemail.com> wrote:
> If the relative class frequencies in the training data are
> unrepresentative of operational conditions you can generally
> compensate by changing the threshold.  This is easiest if you have a
> classifier that outputs an estimate of probability, such as KLR.

If the relative class frequencies in the training data are both
unbalanced (say fi > 2*fj for classes i and j) and unrepresentative
of operational conditions, compensate for the unbalancing first.
Several techniques, assuming posterior probability outputs with
unipolar binary targets, are discussed in the archives (e.g., search
on greg-heath unbalanced). They involve nonuniform presentation
frequencies and/or weighted objective functions to simulate
unweighted training with equal training class sizes.  The resulting
posteriors should then be rescaled, post-training, to match the
specified operational priors. Threshold compensation should then
not be necessary if misclassification costs are equal.
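
A small Python sketch of the two steps, under the assumptions stated
above (two classes, outputs that estimate posterior probabilities). The
weighting scheme shown, n / (number of classes * class count), is one
common choice rather than the only one, and both function names are
made up.

```python
from collections import Counter


def class_weights(labels):
    """Per-class weights that make every class contribute equally in total."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {label: n / (k * count) for label, count in counts.items()}


def rescale_posterior(p, effective_train_prior, operational_prior):
    """Re-express a positive-class posterior under the operational prior.

    effective_train_prior is the positive-class prior the network saw
    during training (0.5 after balancing with class_weights).
    """
    pos = p * operational_prior / effective_train_prior
    neg = (1.0 - p) * (1.0 - operational_prior) / (1.0 - effective_train_prior)
    return pos / (pos + neg)
```

The weights multiply each pattern's term in the error function during
training; the rescaling is applied to the trained network's outputs
afterwards.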

Similar techniques can be used if class-conditional misclassification
costs are specified and the problem becomes one of minimizing risk
(sum of products of probability density, operational priors and
misclassification costs). Details of classification risk minimization
are covered in pattern recognition texts (e.g., Duda et al., Nilsson,
Fukunaga, Devijver & Kittler, etc.) as well as the familiar NN texts
(e.g., Bishop, Ripley). It may help to use archive search keyword
strings that include greg-heath risk cost.
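
For the two-class case, the minimum-risk decision can be sketched in
Python as below. It assumes the posterior has already been corrected
for the operational priors, that correct decisions cost nothing, and
that the two cost values are supplied by the user; the names are
hypothetical.

```python
def min_risk_decision(posterior_eye, cost_false_positive, cost_false_negative):
    """Pick the label with the smaller expected misclassification cost."""
    risk_if_eye = (1.0 - posterior_eye) * cost_false_positive
    risk_if_not_eye = posterior_eye * cost_false_negative
    return "eye" if risk_if_eye < risk_if_not_eye else "not eye"
```

With equal costs this reduces to thresholding the posterior at 0.5;
making a missed eye five times as costly as a false alarm lowers the
effective threshold to 1/6.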

Hope this helps.

Greg

Reply Greg 11/10/2007 9:46:47 PM

On 10-Nov-2007, Gavin Cawley <GavinCawley@googlemail.com> wrote:

> I have had a look at your training data using kernel logistic
> regression (basically a sophisticated sort of RBF network) using my
> Generalised Kernel Machine MATLAB toolbox (you can download it from
> http://theoval.cmp.uea.ac.uk/~gcc/projects/gkm/).  It gives a 10-fold
> cross-validation error of 8.82%.

I'm interested in your kernel logistic regression procedure.  Where can I get a
copy of your paper describing the algorithm?

-- 
Phil Sherrod
(PhilSherrod 'at' comcast.net)
http://www.dtreg.com  (Decision trees, Neural networks and SVM modeling)
http://www.nlreg.com  (Nonlinear Regression)
Reply Phil 11/11/2007 2:03:30 PM

On 11 Nov, 14:03, "Phil Sherrod" <PhilSher...@REMOVETHIScomcast.net>
wrote:
> On 10-Nov-2007, Gavin Cawley <GavinCaw...@googlemail.com> wrote:
>
> > I have had a look at your training data using kernel logistic
> > regression (basically a sophisticated sort of RBF network) using my
> > Generalised Kernel Machine MATLAB toolbox (you can download it from
> > http://theoval.cmp.uea.ac.uk/~gcc/projects/gkm/).  It gives a 10-fold
> > cross-validation error of 8.82%.
>
> I'm interested in your kernel logistic regression procedure.  Where can I get a
> copy of your paper describing the algorithm?

I've added a pre-print of the paper on the project web page.  I am
working on some proper documentation for the toolbox; for the moment
there are only the demo scripts that reproduce the illustrative results
given in the paper.  Let me know if you have any questions (via my
work email address given on the paper).

Reply Gavin 11/11/2007 2:23:09 PM

On 11-Nov-2007, Gavin Cawley <GavinCawley@googlemail.com> wrote:

> > > I have had a look at your training data using kernel logistic
> > > regression (basically a sophisticated sort of RBF network) using my
> > > Generalised Kernel Machine MATLAB toolbox (you can download it from
> > > > http://theoval.cmp.uea.ac.uk/~gcc/projects/gkm/).  It gives a 10-fold
> > > cross-validation error of 8.82%.
> >
> > I'm interested in your kernel logistic regression procedure.  Where can I
> > get a copy of your paper describing the algorithm?
>
> I've added a pre-print of the paper on the project web page.  I am
> working on some proper documentation for the toolbox; for the moment
> there are only the demo scripts that reproduce the illustrative results
> given in the paper.  Let me know if you have any questions (via my
> work email address given on the paper).

I've got it.  It took me a couple of minutes to notice the "pdf" link for the
paper. You might want to make the link a little more obvious.

Thank you for posting the paper.

-- 
Phil Sherrod
(PhilSherrod 'at' comcast.net)
http://www.dtreg.com  (Decision trees, Neural networks and SVM modeling)
http://www.nlreg.com  (Nonlinear Regression)
Reply Phil 11/11/2007 7:21:46 PM

On Nov 8, 3:01 pm, Chris <chris.ole...@gmail.com> wrote:
> On Oct 26, 5:33 pm, "Phil Sherrod" <PhilSher...@REMOVETHIScomcast.net>
> wrote:
>
> > On 26-Oct-2007, Chris <chris.ole...@gmail.com> wrote:
>
> > > > Would it be possible for you to post your data so that we could
> > > > have a look at it?
>
> > > That's one of the problems! I'm having trouble generating meaningful,
> > > reproducible data due to normalisation issues
>
> > I think you're going to have to clear that up before you worry about the
> > details of creating a neural network to model your data.  Once you get
> > meaningful, reproducible data, set up a link where we can access it, and
> > we'll try to help you build a model.
>
> I've had some success classifying objects through using the C++ NN
> package a colleague gave me. However, the results in training and
> evaluating do not translate themselves when I try to apply the
> weightings.
>
> The full data set that I am training over can be found here
>
> http://www.eng.uwaterloo.ca/~cvolekas/Ann-Data.txt

I'm unable to reach either

http://www.eng.uwaterloo.ca/~cvolekas
and
http://www.eng.uwaterloo.ca/~cvolekas/Ann-Data.txt

> Basically it's a 48-feature vector with the last column indicating the
> target value of the row. I have 2244 rows of data, 1144 for target 0
> and 1144 for target 1.

2288? (1144 + 1144 = 2288, not 2244.)

> The C++ program uses two thresholds for training, 0.9 for 1, and 0.3
> for 0. If it is between 0.9 and 0.3 it notes that the output is a low
> confidence value. While training, I get around a 94% classification
> rate where 6% of the values fall between 0.9 and 0.3.

Strange way to train.

Why not just use {0,1}  targets and tabulate error rates
( class-conditional and mixture) as a function of  a
classification threshold?
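
A small Python sketch of that tabulation, in case it helps. The outputs and targets are toy numbers invented for the example, error_table is a hypothetical name, and the rule "assign class 1 when output >= threshold" is an assumption of the sketch.

```python
import numpy as np

def error_table(outputs, targets, thresholds):
    """Rows of (threshold, class-0 error, class-1 error, mixture error).

    Assumed rule: assign class 1 when output >= threshold.
    Both classes must be present in targets.
    """
    outputs = np.asarray(outputs, dtype=float)
    targets = np.asarray(targets).astype(int)
    rows = []
    for t in thresholds:
        predicted = (outputs >= t).astype(int)
        e0 = np.mean(predicted[targets == 0] == 1)  # class 0 called class 1
        e1 = np.mean(predicted[targets == 1] == 0)  # class 1 called class 0
        mix = np.mean(predicted != targets)         # overall error rate
        rows.append((float(t), float(e0), float(e1), float(mix)))
    return rows

# Toy network outputs with {0,1} targets, made up for illustration
out = [0.1, 0.2, 0.6, 0.4, 0.8, 0.9]
tgt = [0, 0, 0, 1, 1, 1]
table = error_table(out, tgt, thresholds=[0.3, 0.5, 0.7])
for row in table:
    print("T=%.2f  e0=%.3f  e1=%.3f  mixture=%.3f" % row)
```

Raising the threshold trades class-0 errors for class-1 errors, so the table shows directly which threshold gives an acceptable pair of rates.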

> When I use the weightings generated from training however within my
> application I get poorer results. There are roughly ten times more
> negatives than positives and the NN can only classify around 60% of
> these as negatives and the rest positives.
>
> Any insight would be great

The training set is balanced which is good. However, I
don't know what training algorithm and stopping rule you
used.

The output needs to be rescaled w.r.t. the difference
between balanced training (training targets {0,1} with
training priors Ptrn0 = Ptrn1) and unbalanced testing
(testing priors Ptst0 = 10*Ptst1). If you had two unity-sum
outputs with {0,1} targets, the rescaled outputs
would be obtained as follows:

a0 = Ptst0/Ptrn0,   a1 = Ptst1/Ptrn1

out0new  =  a0*out0old / (a0*out0old+a1*out1old)
out1new  =  a1*out1old / (a0*out0old+a1*out1old)

Therefore, using only one output,

out1new  =  a1*out1old / (a0*(1-out1old)+a1*out1old)
         =  a1*out1old / (a0+(a1-a0)*out1old)

with the assumption out0new = 1-out1new.

For the current scenario, a0 = 10*a1. Therefore,

out1new  =  out1old / (10-9*out1old)
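
The single-output formula can be sketched in Python as follows. The function and argument names are hypothetical; ptrn1 and ptst1 stand for Ptrn1 and Ptst1, and the class-0 priors are taken as one minus those values.

```python
def rescale_posterior(out1_old, ptrn1, ptst1):
    """Rescale a class-1 posterior from training priors to test priors.

    Assumes 0 < ptrn1 < 1 and two classes, so Ptrn0 = 1 - ptrn1 and
    Ptst0 = 1 - ptst1.
    """
    a0 = (1.0 - ptst1) / (1.0 - ptrn1)
    a1 = ptst1 / ptrn1
    return a1 * out1_old / (a0 + (a1 - a0) * out1_old)

# Balanced training (0.5/0.5); testing with Ptst0 = 10*Ptst1, i.e. Ptst1 = 1/11
for out in (0.5, 0.9, 0.99):
    new = rescale_posterior(out, ptrn1=0.5, ptst1=1.0 / 11.0)
    print(out, new, out / (10.0 - 9.0 * out))  # last two columns should agree
```

Solving out1old / (10-9*out1old) = 1/2 gives out1old = 10/11, so under these priors the raw output has to exceed about 0.91 before the rescaled value passes 1/2.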

Finally, if misclassification costs are not equal, a
classification threshold can be offset from 1/2 to
minimize classification risk on an independent
validation set.

Note that misclassification costs could have been
used in the rescaling equations; however, I chose
to keep it simple.

Hope this helps.

Greg

Reply Greg 11/12/2007 2:42:50 AM

On Nov 8, 7:03 pm, Chris <chris.ole...@gmail.com> wrote:
> On Nov 8, 3:01 pm, Chris <chris.ole...@gmail.com> wrote:
> > On Oct 26, 5:33 pm, "Phil Sherrod" <PhilSher...@REMOVETHIScomcast.net>
> > wrote:
>
> > > On 26-Oct-2007, Chris <chris.ole...@gmail.com> wrote:
>
> > > > > Would it be possible for you to post your data so that
> > > > > we could have a look at it?
>
> > > > That's one of the problems! I'm having trouble generating
> > > > meaningful, reproducible data due to normalisation issues
>
> > > I think you're going to have to clear that up before you worry about the
> > > details of creating a neural network to model your data.  Once you get
> > > > meaningful, reproducible data, set up a link where we can access it,
> > > > and we'll try to help you build a model.
>
> > I've had some success classifying objects through using the C++ NN
> > package a colleague gave me. However, the results in training and
> > evaluating do not translate themselves when I try to apply the
> > weightings.
>
> > The full data set that I am training over can be found here
> > http://www.eng.uwaterloo.ca/~cvolekas/Ann-Data.txt
>
> > Basically it's a 48-feature vector with the last column indicating the
> > target value of the row. I have 2244 rows of data, 1144 for target 0
> > and 1144 for target 1.
>
> > The C++ program uses two thresholds for training, 0.9 for 1, and 0.3
> > for 0. If it is between 0.9 and 0.3 it notes that the output is a low
> > confidence value. While training, I get around a 94% classification
> > rate where 6% of the values fall between 0.9 and 0.3.
>
> > When I use the weightings generated from training however within my
> > application I get poorer results. There are roughly ten times more
> > negatives than positives and the NN can only classify around 60% of
> > these as negatives and the rest positives.
>
> > Any insight would be great
>
> Just another note.
>
> I tested my data within Matlab and got 99% classification after 10000
> epochs. However, when I converted the weightings gained through the
> Matlab netopt command and used them within my C++ program I still
> received somewhat lackluster results.
>
> Also, for the C++ NN program, I made some changes such that binary
> classification was more valid (originally I messed up and would never
> account for features being misclassified, only if they weren't within
> my confidence thresholds). Now I'm getting results similar to this
> (after 10k epochs)
>
> Correctly classified:
> 1878
> Misclassified:
> 210
> Not within bounds of confidence (output node's value is between 0.9
> and 0.3)
> 156

I don't know where this {0.9,0.1} target / {0.9,0.3} confidence bull
comes from.

1. Train using {0,1} targets.
2. Use a validation set to obtain piecewise constant
class-conditional output histograms.
3. Integrate the histograms (forward for class 1, backward for
class 0) to obtain the corresponding piecewise linear
class-conditional cumulative probability distributions (S-shaped
for class 1, reverse S-shaped for class 0).
4. The curves can then be interpreted as error rate vs threshold:
Choose a threshold and read off the error rates from the curves.
5. Choose acceptable error rates for each class and obtain
the corresponding thresholds.
6. If T1 >= T0, the classification rule is

if out >= T1 then assign x to class 1
else assign x to class 0

However, if T1 < T0, the classification rule is

if out > T0 then assign x to class 1
elseif out < T1 then assign x to class 0
else assign x to the low confidence non classification category.
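
The six steps can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the error curves are evaluated on a fixed grid of 101 candidate thresholds instead of integrated histograms, the validation outputs are toy numbers lying strictly between 0 and 1, and pick_thresholds and classify are hypothetical names.

```python
import numpy as np

def pick_thresholds(out0, out1, max_e0, max_e1, grid=None):
    """Return (T0, T1) from validation outputs of class 0 and class 1.

    T0: smallest threshold whose class-0 error (out0 >= T) is <= max_e0.
    T1: largest threshold whose class-1 error (out1 < T) is <= max_e1.
    Assumes all outputs lie strictly between 0 and 1.
    """
    out0 = np.asarray(out0, dtype=float)
    out1 = np.asarray(out1, dtype=float)
    if grid is None:
        grid = np.linspace(0.0, 1.0, 101)  # assumed candidate thresholds
    e0 = np.array([np.mean(out0 >= t) for t in grid])  # falls as T rises
    e1 = np.array([np.mean(out1 < t) for t in grid])   # rises with T
    t0 = grid[e0 <= max_e0].min()
    t1 = grid[e1 <= max_e1].max()
    return float(t0), float(t1)

def classify(out, t0, t1):
    """Rule from step 6: returns 1, 0, or None for the low confidence case."""
    if t1 >= t0:
        return 1 if out >= t1 else 0
    if out > t0:
        return 1
    if out < t1:
        return 0
    return None

# Toy validation outputs, made up for illustration
out0 = [0.055, 0.125, 0.225, 0.315, 0.705]  # class 0
out1 = [0.405, 0.805, 0.855, 0.905, 0.955]  # class 1

t0, t1 = pick_thresholds(out0, out1, max_e0=0.0, max_e1=0.0)
print(t0, t1)  # here T1 < T0, so a low confidence band exists
print([classify(o, t0, t1) for o in (0.1, 0.5, 0.9)])
```

Asking for zero error on both classes forces T1 below T0 in this toy data, so outputs between the two thresholds go to the low confidence category. Relaxing the acceptable error rates moves T0 down and T1 up until a single threshold is enough.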

Hope this helps.

Greg

Reply Greg 11/12/2007 3:14:45 AM

On Nov 9, 9:50 am, Chris <chris.ole...@gmail.com> wrote:
> On Nov 8, 10:01 pm, "Phil Sherrod" <PhilSher...@REMOVETHIScomcast.net>
> wrote:
>
> > On  8-Nov-2007, Chris <chris.ole...@gmail.com> wrote:
>
> > > I tested my data within Matlab and got 99% classification after 10000
> > > epochs.
>
> > Is that 99% accuracy on the training data?  Did you do any sort of validation
> > such as holding out a sample or doing v-fold cross-validation?
>
> I admit that I haven't done much validation with holding out a sample
> or v-fold cross-validation. I was worried that the amount of data I
> had wasn't enough for training so I tried to refrain from leaving out
> some data.

After obtaining a 1% training error rate, you should have no
fear of that.

> The application that requires the neural net for
> classification is what I'm using for validation basically, and this is
> where it's doing poorly. The application is in its control phase as we
> validate how effective the NN is.

Better to get to the heart of the matter and use holdout or cross
validation on the NN outside of the application.
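
For what it's worth, a minimal Python sketch of generating v-fold cross-validation splits outside the application. Only the index bookkeeping is shown; training and scoring the network on each split depends on the NN package and is left out. kfold_indices is a hypothetical name, and the 2244 rows are the figure quoted for the posted data set.

```python
import numpy as np

def kfold_indices(n_rows, k, seed=0):
    """Yield (train_idx, valid_idx) pairs for k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_rows)     # shuffle the row indices once
    folds = np.array_split(order, k)    # k nearly equal folds
    for i in range(k):
        valid = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, valid

splits = list(kfold_indices(2244, k=10))
for train_idx, valid_idx in splits:
    # Train on train_idx and measure the error rate on valid_idx here.
    print(len(train_idx), len(valid_idx))
```

Every row is held out exactly once, so averaging the ten validation error rates uses all of the data for testing without ever testing on a row that was used for training in that fold.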

> > How many layers and how many nodes were in the neural
> > network you built?
>
> Currently I'm using a single layer with 37 hidden nodes.

How did you arrive at that value?

How many input nodes?

> > For analyses with a continuous target variable such as you have, I'm more
> > accustomed to measuring accuracy as a proportion (or percent) of variance
> > explained by the model.  Did you get that figure out of the analysis?
>
> > I built a General Regression Neural Network for your data.  It explained
> > 99.9% of the variance of the training data, but it explained only 73% of
> > the variance on a 20% hold-out sample.

Another Heath pedanticism:

R^2 doesn't "explain" anything.

Hope this helps.

Greg

Reply Greg 11/12/2007 10:46:50 AM

On Nov 9, 10:11 am, Chris <chris.ole...@gmail.com> wrote:
> On Nov 9, 9:50 am, Chris <chris.ole...@gmail.com> wrote:
> > On Nov 8, 10:01 pm, "Phil Sherrod" <PhilSher...@REMOVETHIScomcast.net>
> > wrote:
>
> > > On  8-Nov-2007, Chris <chris.ole...@gmail.com> wrote:
>
> > > > I tested my data within Matlab and got 99% classification
> > > > after 10000 epochs.
>
> > > Is that 99% accuracy on the training data?  Did you do any
> > > sort of validation such as holding out a sample or doing
> > > v-fold cross-validation?
>
> > I admit that I haven't done much validation with holding out a sample
> > or v-fold cross-validation. I was worried that the amount of data I
> > had wasn't enough for training so I tried to refrain from leaving out
> > some data. The application that requires the neural net for
> > classification is what I'm using for validation basically, and this is
> > where it's doing poorly. The application is in its control phase as we
> > validate how effective the NN is.
>
> I would also like to add that the data I'm validating on is not
> identical to what I'm training on. There are chances that a feature
> I'm evaluating will be very close to a feature I trained on however
> for the most part the features only have a little in common and from
> looking at the validation data almost none are identical

Theory:

Design (Training + Validation) and Test data are all assumed
to be random draws from the same probability distribution.

Practice:

If you make the assumption that the Design data adequately
models the salient features of the operational data, then you
can correct for the above problem by taking random draws
from the Design set to obtain the Training and Validation
sets.

If your Test or operational data doesn't appear to have the
same salient features ... and ... a sample is not available for
design, then you have to try to modify the Design data to better
represent the operational data.
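
A short Python sketch of taking random draws from the Design set to form the Training and Validation sets. Splitting each class separately (stratifying) and the 20% validation fraction are assumptions of this sketch, not something required by the argument above; split_design is a hypothetical name.

```python
import numpy as np

def split_design(targets, valid_fraction=0.2, seed=0):
    """Random split of the Design set, done separately within each class.

    Returns (train_idx, valid_idx) as arrays of row indices.
    """
    targets = np.asarray(targets)
    rng = np.random.default_rng(seed)
    train, valid = [], []
    for c in np.unique(targets):
        idx = rng.permutation(np.flatnonzero(targets == c))
        n_valid = int(round(valid_fraction * idx.size))
        valid.append(idx[:n_valid])
        train.append(idx[n_valid:])
    return np.concatenate(train), np.concatenate(valid)

# Toy Design set: 80 negatives and 20 positives
targets = np.array([0] * 80 + [1] * 20)
train_idx, valid_idx = split_design(targets)
print(len(train_idx), len(valid_idx), int(targets[valid_idx].sum()))
```

Splitting within each class keeps the class proportions the same in both subsets, which matters when one class is much rarer than the other.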

Hope this helps.

Greg

Reply Greg 11/12/2007 11:02:47 AM

On Nov 11, 10:14 pm, Greg Heath <he...@alumni.brown.edu> wrote:
> On Nov 8, 7:03 pm, Chris <chris.ole...@gmail.com> wrote:
> > On Nov 8, 3:01 pm, Chris <chris.ole...@gmail.com> wrote:
> > > On Oct 26, 5:33 pm, "Phil Sherrod" <PhilSher...@REMOVETHIScomcast.net>
> > > wrote:
>
> > > > On 26-Oct-2007, Chris <chris.ole...@gmail.com> wrote:
>
> > > > > > Would it be possible for you to post your data so that
> > > > > > we could have a look at it?
>
> > > > > That's one of the problems! I'm having trouble generating
> > > > > meaningful, reproducible data due to normalisation issues
>
> > > > I think you're going to have to clear that up before you worry
> > > > about the details of creating a neural network to model your
> > > > data.  Once you get
> > > > > meaningful, reproducible data, set up a link where we can access it,
> > > > > and we'll try to help you build a model.
>
> > > > I've had some success classifying objects through using the C++ NN
> > > package a colleague gave me. However, the results in training and
> > > evaluating do not translate themselves when I try to apply the
> > > weightings.
>
> > > The full data set that I am training over can be found here
> > > http://www.eng.uwaterloo.ca/~cvolekas/Ann-Data.txt
>
> > > Basically it's a 48-feature vector with the last column indicating the
> > > target value of the row. I have 2244 rows of data, 1144 for target 0
> > > and 1144 for target 1.
>
> > > The C++ program uses two thresholds for training, 0.9 for 1, and 0.3
> > > for 0. If it is between 0.9 and 0.3 it notes that the output is a low
> > > confidence value. While training, I get around a 94% classification
> > > rate where 6% of the values fall between 0.9 and 0.3.
>
> > > When I use the weightings generated from training however within my
> > > application I get poorer results. There are roughly ten times more
> > > negatives than positives and the NN can only classify around 60% of
> > > these as negatives and the rest positives.
>
> > > Any insight would be great
>
> > Just another note.
>
> > I tested my data within Matlab and got 99% classification after 10000
> > epochs. However, when I converted the weightings gained through the
> > Matlab netopt command and used them within my C++ program I still
> > received somewhat lackluster results.
>
> > Also, for the C++ NN program, I made some changes such that binary
> > classification was more valid (originally I messed up and would never
> > account for features being misclassified, only if they weren't within
> > my confidence thresholds). Now I'm getting results similar to this
> > (after 10k epochs)
>
> > Correctly classified:
> > 1878
> > Misclassified:
> > 210
> > Not within bounds of confidence (output node's value is between 0.9
> > and 0.3)
> > 156
>
> I don't know where this {0.9,0.1} target / {0.9,0.3} confidence bull
> comes from.
>
> 1. Train using {0,1} targets.
> 2. Use a validation set to obtain piecewise constant
> class-conditional output histograms.
> 3. Integrate the histograms (forward for class 1, backward for
> class 0) to obtain the corresponding piecewise linear
> class-conditional cumulative probability distributions (S-shaped
> for class 1, reverse S-shaped for class 0).
> 4. The curves can then be interpreted as error rate vs threshold:
> Choose a threshold and read off the error rates from the curves.
> 5. Choose acceptable error rates for each class and obtain
> the corresponding thresholds.
> 6. If T1 >= T0, the classification rule is
>
> if out >= T1 then assign x to class 1
> else assign x to class 0

Obviously this results in the specified e1, but an e0 that is lower
than specified. It is also obvious that any single threshold

T0 < T <= T1

(e.g., a more balanced T = (T1+T0)/2 ) can be used. The advice
I gave in the previous post is equivalent to using T = T1. It's
OK; however, an intermediate value makes more sense.

> However, if T1 < T0, the classification rule is
>
> If out > T0 then assign x to class 1
> elseif out < T1 then assign x to class 0
> else assign x to the low confidence non classification category.

Hope this helps.

Greg

Reply Greg 11/12/2007 11:26:40 AM
