For my course, I have been asked to write a NN program.
For now, I just want it to be able to convert small letters
into capitals.
My teacher gave me very few notes, and too many points are still fuzzy to me.
Of course, it does not work:
I use two layers of 32 cells, with a learning rate of 0.001, and run
10000 cycles in the learning stage.
Are these values for the learning rate and the number of learning cycles OK?
When feeding the data into the first layer, should I put the same data into
all 32 cells, or should I quantize the input, for example by
converting the data into a 32-bit value and putting one bit in each cell?
Actually, my inputs are characters, which are 8-bit data ... if I have to give
only one bit per cell, should I reduce the number of cells to 8, or should
I give the same bit to several cells?
For the connection between layers 1 and 2, should the data be
transmitted as floats (floating-point data), integers, or booleans (only
values of 0 or 1)? In which range of numbers?
The text of the exercise is licensed by the university, so I may not
paste it in here. It just gives some formulas for the activation unit,
the error rate calculation, and the error correction during the
learning stage.
For those who are ready to help more, I attach the source code I wrote
during the tutorial, within 2h30 ... I expect it to have many, many
mistakes about NN :( please don't shout at me :/
I am sorry for the flat variable space, but to write code fast, I use
_short_ names ...
Thanks for any help.
--
DEMAINE Benoît-Pierre http:/www.demaine.info/
\_o< apt-get remove ispell >o_/
There're 10 types of people: those who can count in binary and those who
can't
Attachment: plop.c
#include <stdlib.h>
//#include "stdafx.h"
//#using <mscorlib.dll>
#include <stdio.h>
#include <math.h>
//using namespace System;
////////////////////////////////////////////////////////////////////////////
#define LEARNING_STEPS 10000 // should be NB_CELL_P_L*NB_CELL_P_L*NB_LAYER
#define LERNING_RATE 0.001 // how much should it be ?
#define NB_LAYER 2 // nb of layers
#define NB_CELL_P_L 32 // nb of cells per layer
#define ALPHA 1
////////////////////////////////////////////////////////////////////////////
float coef[NB_LAYER][NB_CELL_P_L][NB_CELL_P_L]; // from 0 to NB_LAYER-1
// to be read as :
// coef[ref of layer of cell][ref of cell in the layer][ref of input cell for current cell]
// should the output be floats or bits ?
int out[NB_LAYER+1][NB_CELL_P_L]; // output of layers from 1 to NB_LAYER
// out[0] is the input of input layer
// out[NB_LAYER] will be the output of output layer
int cell(int layer,int cell) // layer from 0 to NB_LAYER-1
// apply the sigmoid function to a cell
{
    float sum, output;
    int i;
    sum=0;
    for(i=0;i<NB_CELL_P_L;i++)
        sum+=out[layer][i]*coef[layer][cell][i]; // read inputs, multiply by coefs, and sum it all
    output=1/(1+expf(-1*ALPHA*sum)); // apply the sigmoid function
    if(output<0.5)
        out[layer+1][cell]=0; // store result in out[]
    else // should this be int/bits or floats ?
        out[layer+1][cell]=1;
    return(0);
}
// once the inputs are valid (out[0]), propagate info across the NN, in order to get the output.
int propagate()
{
    int i,j;
    for(i=0;i<NB_LAYER;i++) // for each layer
        for(j=0;j<NB_CELL_P_L;j++) // for each cell
            cell(i,j);
    return(0);
}
int init()
{
    int i,j,k;
    float f;
    // putting random coefs as default
    for(i=0;i<NB_LAYER;i++)
        for(j=0;j<NB_CELL_P_L;j++)
            for(k=0;k<NB_CELL_P_L;k++)
            {
                f=(float)rand();
                f=f/(RAND_MAX/2);
                f-=1;
                // printf("%f\n\r",f);
                coef[i][j][k]=f;
            }
    return(0);
}
int learning()
{
    int l,i,k;
    float o;
    char *inc;
    char *outc;
    inc="abcdefghijklmnopqrstuvwxyz";
    outc="ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    for(l=0;l<LEARNING_STEPS;l++)
    {
        int r;
        float f,sum;
        float tin, target;
        f=(float)rand();
        f=f*26/RAND_MAX;
        tin=inc[(int)f];
        target=outc[(int)f];
        for(r=0;r<NB_CELL_P_L;r++)
            // I don't know whether to put a float per cell or just a bit per cell ?
            out[0][r]=(int)tin;
        // compute output from input
        propagate();
        // correct the coefs
        for(i=(NB_LAYER-1);i>=0;i--) // foreach layer ( from output to input)
            for(r=0;r<NB_CELL_P_L;r++) // foreach cell
            {
                for(k=0;k<NB_CELL_P_L;k++) // foreach link
                {
                    float t;
                    // correct the coef of the link
                    t=(float)out[i+1][r];
                    t=t*(1-t)*(target-t);
                    if(i==(NB_LAYER-1))
                    {
                        o=t;
                    }
                    else
                    {
                        int z;
                        float m;
                        m=0;
                        for(z=0;z<NB_CELL_P_L;z++)
                        {
                            m+=t*coef[i+1][k][r];
                        }
                        o=(float)out[i+1][r];
                        o=o*(1-o)*(m);
                    }
                    coef[i][r][k]=coef[i][r][k]+((float)LERNING_RATE*o*out[i][k]);
                }// end of 3rd for
            }// end of 2nd for
        // end of 1st for
        // compute error rate
        sum=0;
        for(r=0;r<NB_CELL_P_L;r++)
        {
            sum+=powf((target-(float)out[NB_LAYER][r]),2);
        }
        sum=sqrt(sum)/NB_CELL_P_L;
        if(!(l%100))
            printf("The error at stage %i is %f.\n\r",l,sum);
    }
    return(0);
}
/*
int main()
{
return(0);
}
*/
//int _tmain()
int main()
{
    // sizeof yields a size_t, so cast it before printing with %u
    printf("A float is %u bytes\n\r",(unsigned)sizeof(float));
    printf("Initialising coefs ...\n\r");
    init();
    printf("Learning from DB ...\n\r");
    learning();
    printf("Processing user data (TO DO)\n\r");
    printf("Done\n\r");
#ifdef __WIN32__
    printf("Press a key\n\r");
    getchar();
#endif
    return 0;
}
nntp_pipex (70)
11/2/2004 11:47:32 PM |
Since nobody loves me ... I reply to myself:
My teacher gave me a magic value for the learning rate: 0.02
For the number of steps, it is usually 3 to 5 times the number of links.
In my case, the good idea is to quantize the input data into
bits; since I am using chars, I split each of them into a group of 8 bits; thus,
the first layer should have 8 cells. Since I want to handle all
characters, the hidden and output layers should have 128 cells each.
In practice, for my exercise, to make learning fast, I limited the app
to 40 chars; and since my code puts the same number of cells in each
layer, I have 3*40 cells. Then 3000 cycles are enough for teaching.
About out[][]: it should not hold ints, but floats: since I am using the
sigmoid function in the cells, I should keep the output of a cell as a float
and threshold it ONLY when converting the output back to a character.
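The bit quantization and the final thresholding described above can be sketched as follows. This is only a minimal illustration: the names char_to_cells, cells_to_char and NB_BITS are hypothetical and do not appear in the attached plop.c, and the sketch assumes 8-bit characters with bit 0 going to cell 0.

```c
#include <assert.h>
#include <stddef.h>

#define NB_BITS 8 /* hypothetical name: one input cell per bit of an 8-bit character */

/* Spread the 8 bits of c over cells[0..7] as 0.0f or 1.0f (bit 0 first). */
void char_to_cells(unsigned char c, float cells[NB_BITS])
{
    int r;
    assert(cells != NULL);
    for (r = 0; r < NB_BITS; r++)
        cells[r] = (float)((c >> r) & 0x01);
}

/* Threshold each cell at 0.5 and pack the resulting bits back into a character. */
unsigned char cells_to_char(const float cells[NB_BITS])
{
    unsigned char c = 0;
    int r;
    assert(cells != NULL);
    for (r = 0; r < NB_BITS; r++)
        if (cells[r] > 0.5f)
            c |= (unsigned char)(1u << r);
    return c;
}
```

In ASCII, a small letter and its capital differ only in bit 5 (0x20), so with this coding the network mostly has to learn to clear one output cell for letters and to copy the other cells through.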
***
The activation function is the sigmoid function:
f(input)=1/(1+e^(- alpha * input))
where alpha is the smoothing coefficient; for the normal sigmoid, choose alpha=1;
a smaller value introduces fuzziness in the results, which is useful
when you want to allow for possible errors in the output (i.e., when you
expect the teaching not to be 100% accurate).
***
Attached is the code as it works, not too badly, tonight.
Attachment: plop.c
#include <stdlib.h>
#ifdef __WIN32__
#include "stdafx.h"
#using <mscorlib.dll>
#endif
#include <stdio.h>
#include <string.h> // strlen(), strcmp()
#include <math.h>
//using namespace System;
////////////////////////////////////////////////////////////////////////////
#define LEARNING_STEPS 3000 // should be NB_CELL_P_L*NB_CELL_P_L*NB_LAYER
#define LERNING_RATE 0.08 // how much should it be ?
#define NB_LAYER 2 // nb of layers
#define NB_CELL_P_L 40 // nb of cells per layer
#define ALPHA 1
////////////////////////////////////////////////////////////////////////////
float coef[NB_LAYER][NB_CELL_P_L][NB_CELL_P_L]; // from 0 to NB_LAYER-1
// to be read as :
// coef[ref of layer of cell][ref of cell in the layer][ref of input cell for current cell]
float out[NB_LAYER+1][NB_CELL_P_L]; // output of layers from 1 to NB_LAYER
// out[0] is the input of input layer
// out[NB_LAYER] will be the output of output layer
int cell(int layer,int cell) // layer from 0 to NB_LAYER-1
// apply the sigmoid function to a cell
{
    float sum, output;
    int i;
    sum=0;
    for(i=0;i<NB_CELL_P_L;i++)
        sum+=out[layer][i]*coef[layer][cell][i];
    // read inputs, multiply by coefs, and sum it all
    output=1/(1+expf(-1*ALPHA*sum)); // apply the sigmoid function
    /* if(output<0.5)
        out[layer+1][cell]=0; // store result in out[]
    else
        out[layer+1][cell]=1;
    */
    out[layer+1][cell]=output;
    return(0);
}
// once the inputs are valid (out[0]),
// propagate info across the NN, in order to get the output.
int propagate()
{
    int i,j;
    for(i=0;i<NB_LAYER;i++) // for each layer
        for(j=0;j<NB_CELL_P_L;j++) // for each cell
            cell(i,j);
    return(0);
}
int init()
{
    int i,j,k;
    float f;
    // putting random coefs as default
    for(i=0;i<NB_LAYER;i++)
        for(j=0;j<NB_CELL_P_L;j++)
            for(k=0;k<NB_CELL_P_L;k++)
            {
                f=(float)rand();
                f=f/(RAND_MAX/2);
                f-=1;
                coef[i][j][k]=f;
            }
    return(0);
}
int learning()
{
    int l,i,k;
    float o;
    char *inc;
    char *outc;
    inc="abcdefghijklmnopqrstuvwxyz .,;()-+*=?!";
    outc="ABCDEFGHIJKLMNOPQRSTUVWXYZ .,;()-+*=?!";
    // inc="abcdefghijklmnopqrstuvwxyz 1234567890.,;()_-+*/=?<>#$%&!@[]{}";
    // outc="ABCDEFGHIJKLMNOPQRSTUVWXYZ 1234567890.,;()_-+*/=?<>#$%&!@[]{}";
    if(strlen(inc)!=strlen(outc))
    {
        printf("Learning strings don't have the same size. Please make them the same length.\n\r");
        exit(1);
    }
    if(strlen(inc)>NB_CELL_P_L)
    {
        printf("The base strings are longer than the number of cells. The network CAN NOT be accurate. Please shorten the strings, or use more cells.\n\r");
        exit(1);
    }
    for(l=0;l<LEARNING_STEPS;l++)
    {
        int r;
        float sum;
        int idx, tin, tout;
        int target[NB_CELL_P_L];
        // random index in 0..strlen(inc)-1 (f*strlen(inc)/RAND_MAX could reach strlen(inc))
        idx=rand()%(int)strlen(inc);
        tin=inc[idx];
        tout=outc[idx];
        for(r=0;r<NB_CELL_P_L;r++)
        {
            // only the 8 low bits carry data; shifting an int by 32 or more is undefined in C
            out[0][r]=(r<8) ? (float)((tin>>r) & 0x01) : 0.0f;
            target[r]=(tout>>(r%8)) & 0x01;
        }
        // compute output from input
        propagate();
        // correct the coefs
        for(i=(NB_LAYER-1);i>=0;i--)
            // foreach layer ( from output to input)
            for(r=0;r<NB_CELL_P_L;r++) // foreach cell
            {
                for(k=0;k<NB_CELL_P_L;k++) // foreach link
                {
                    float t;
                    // correct the coef of the link
                    t=(float)out[i+1][r];
                    t=t*(1-t)*(target[r]-t);
                    // use an equation to correct weight of output layer ...
                    if(i==(NB_LAYER-1))
                    {
                        o=t;
                    }
                    else
                    // and another one for the layers below ...
                    {
                        int z;
                        float m;
                        m=0;
                        // note: t is this cell's own term; textbook back-propagation
                        // would sum (error term of cell z)*coef[i+1][z][r] instead
                        for(z=0;z<NB_CELL_P_L;z++)
                        {
                            m+=t*coef[i+1][z][r];
                        }
                        o=(float)out[i+1][r];
                        o=o*(1-o)*(m);
                    }
                    coef[i][r][k]=coef[i][r][k]+((float)LERNING_RATE*o*out[i][k]);
                }// end of 3rd for
            }// end of 2nd for
        // end of 1st for
        // compute error rate
        sum=0;
        for(r=0;r<NB_CELL_P_L;r++)
        {
            sum+=powf((target[r]-(float)out[NB_LAYER][r]),2);
        }
        sum=sqrt(sum)/NB_CELL_P_L;
        if(!(l%100))
            printf("(%c_%c)The error at stage %6.6i is %f.\n\r",tin,tout,l,sum);
    }
    return(0);
}
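int translate(char *in,int size)
{
    int i;
    for(i=0;i<size;i++)
    {
        int r;
        // only the 8 low bits carry data; shifting by 32 or more is undefined in C
        for(r=0;r<NB_CELL_P_L;r++)
            out[0][r]=(r<8) ? (float)((((unsigned char)in[i])>>r) & 0x01) : 0.0f;
        propagate();
        in[i]=0;
        // threshold each output cell and rebuild the character bit by bit
        for(r=0;r<8;r++)
        {
            in[i]=in[i]+(((out[NB_LAYER][r] > 0.5) ? 1 : 0) << r);
        }
    }
    return(0);
}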
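#ifdef __WIN32__
int _tmain()
#else
int main()
#endif
{
    char sent[4096];
    int done; // was named 'exit', which hid the library function exit()
    printf("Initialising coefs ...\n\r");
    init();
    printf("Learning from DB ...\n\r");
    learning();
    printf("Processing user data (TO DO)\n\r");
    done=0;
    while(!done)
    {
        printf("\n\rPlease enter a sentence in small letters, or 'done' to finish :\n\r");
        // fgets() replaces gets(), which cannot limit the input length;
        // fflush(stdin) was dropped because flushing an input stream is undefined
        if(fgets(sent,sizeof(sent),stdin)==NULL)
            break;
        if(strlen(sent)>0 && sent[strlen(sent)-1]=='\n')
            sent[strlen(sent)-1]=0; // drop the newline kept by fgets()
        if(!strcmp(sent,"done"))
            done=1;
        translate(sent,strlen(sent));
        printf("The translation to capitals is:\n\r%s\n\r",sent);
    }
    printf("Bye\n\r");
#ifdef __WIN32__
    printf("Press a key\n\r");
    getchar();
#endif
    return 0;
}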
nntp_pipex (70)
11/5/2004 2:42:11 AM
"DEMAINE Benoit-Pierre" <nntp_pipex@demaine.info> wrote
>
> For my course, I am asked to write a NN program.
> For now, I just want it to be able to be able to convert small letters
> into capitals.
>
It is relatively easy to get a neural network to work on a toy problem, much
harder to actually solve unknown problems.
>
> I use two layers of 32 cells, with a learning rate of 0.001, and do
> 10000 cycles at learning stage.
>
> Are values for learning rate and learning stage ok ?
>
Normally you have "training data" and "test data". Train the net on the
training data, and cease training when performance ceases to improve on the
test data. This avoids over-fitting.
No-one can answer the learning rate question without seeing the training
algorithm you are using, but 0.01 - 0.001 is generally pretty reasonable.
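One way to turn "cease training when performance ceases to improve on the test data" into code is sketched below. The function name early_stop_epoch and the patience parameter (how many epochs without improvement to tolerate) are assumptions made for this illustration, not part of the advice above.

```c
#include <assert.h>
#include <stddef.h>

/* test_err[e] is the error measured on the test data after training epoch e.
   Returns the epoch at which to stop: the first epoch where the test error
   has not improved on its best value for `patience` epochs in a row,
   or n-1 if that never happens. */
size_t early_stop_epoch(const float *test_err, size_t n, size_t patience)
{
    size_t best = 0, e;
    assert(test_err != NULL && n > 0 && patience > 0);
    for (e = 1; e < n; e++) {
        if (test_err[e] < test_err[best])
            best = e;               /* still improving */
        else if (e - best >= patience)
            return e;               /* no improvement for `patience` epochs */
    }
    return n - 1;
}
```

In a real training loop you would measure the test error after each pass over the training data and break out of the loop under the same condition, usually keeping a copy of the weights from the best epoch.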
>
> when inputing the data in the first layer, should I the same data into
> all 32 cells, or should I quantize the input into values, for example
> converting the data into 32bit data, and put one bit in each cell ?
>
> Actually, I input characters, which are 8bit datas ... if I have to give
> only one bit per cell, shall I reduce the number of cells to 8, or shall
> I give the same bit to several cells ?
>
You need to code the data in a relevant way. For instance, if you have red, green,
blue and yellow flowers and you are trying to classify them by species, then
coding colour as one channel with red = 0.1, green = 0.2, and so on won't
give good results. OTOH if you are classifying stars by type, then coding
the colour according to frequency would make sense.
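A common alternative to the single colour channel, which the paragraph above does not name explicitly, is to give each colour its own input so that no artificial ordering between colours is implied. A minimal sketch, with a hypothetical enum and function name:

```c
#include <assert.h>

enum colour { RED, GREEN, BLUE, YELLOW, NB_COLOURS }; /* hypothetical names */

/* One input channel per colour: the matching channel is 1, the others are 0. */
void colour_to_inputs(enum colour c, float inputs[NB_COLOURS])
{
    int i;
    assert((int)c >= 0 && (int)c < NB_COLOURS);
    for (i = 0; i < NB_COLOURS; i++)
        inputs[i] = (i == (int)c) ? 1.0f : 0.0f;
}
```

For the star example a single numeric channel is fine, because there the frequency really is an ordered quantity.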
>
> for the connection between layers 1 and 2, should the data be
> transmitted as floats ( floating point data), integers, or boleans( only
> values of 0 or 1) ? in which range of numbers ?
>
Simplify the algorithm by transmitting everything as floats. Many people
like to use 0.1 to code 0 and 0.9 to code 1, to avoid the problems of values
getting too close to zero.
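The 0.1 / 0.9 coding can be sketched as below; the names TARGET_LOW, TARGET_HIGH, bit_to_target and output_to_bit are hypothetical. As I understand the usual rationale, a sigmoid only approaches 0 and 1 without reaching them, so targets of exactly 0 and 1 keep pushing the weights to grow, while 0.1 and 0.9 are reachable.

```c
#include <assert.h>

#define TARGET_LOW  0.1f /* training target that stands for bit 0 */
#define TARGET_HIGH 0.9f /* training target that stands for bit 1 */

/* Training target for one output cell. */
float bit_to_target(int bit)
{
    assert(bit == 0 || bit == 1);
    return bit ? TARGET_HIGH : TARGET_LOW;
}

/* Reading an output cell back: the threshold stays halfway, at 0.5. */
int output_to_bit(float output)
{
    return output > 0.5f;
}
```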
>
> The text of the exercise in licensed by the university, thus I may not
> past it in here. It just gives some formulas about the activation unit,
> the error rate calculation, and the correction of error during the
> learning stage.
>
> for those who are ready to help more, I attach the source code I wrote
> during the tutarial, within 2h30 ... I expect it to have many many
> mistakes about NN :( please dont shout at me :/
>
> I am sorry for the flat variable space, but to write code fast, I use
> _short_ names ...
>
Malcolm
11/6/2004 12:12:06 PM