Hi, I have a problem for which I'm not sure some sort of
machine learning would be appropriate.
Suppose I have a dataset consisting of thousands of tuples
and a score for each tuple which was determined through an
unknown process. Now, for a new input tuple, I want
to get a new predicted score. I have already sifted through
the data and have determined there is no clear mapping
between the tuple and the score; two tuples with the same
data may have differing scores.
For example, the tuple could consist of a person's age,
address, gender, favourite book. There has already been
a score associated for each tuple through some means.
Now I want to input myself and predict my score (rather
than running it through the scoring process, which is
not what I want).
What could help me here? I was looking at a neural
network implementation. I have never taken AI or
machine learning classes, so some pointers to websites
or books would be appreciated.

Posted by digital_puer, 8/16/2007 5:08:20 PM

On Aug 16, 12:08 pm, Digital Puer <digital_p...@hotmail.com> wrote:
> Hi, I have a problem for which I'm not sure some sort of
> machine learning would be appropriate.
>
> Suppose I have a dataset consisting of thousands of tuples
> and a score for each tuple which was determined through an
> unknown process. Now, for a new input tuple, I want
> to get a new predicted score. I have already sifted through
> the data and have determined there is no clear mapping
> between the tuple and the score; two tuples with the same
> data may have differing scores.
So the score isn't based on the tuple. There's no learning
process that can help you. As far as the tuples are concerned,
the scores are inconsistent.
>
> For example, the tuple could consist of a person's age,
> address, gender, favourite book. There has already been
> a score associated for each tuple through some means.
> Now I want to input myself and predict my score (rather
> than running it through the scoring process, which is
> not what I want).
>
> What could help me here? I was looking at a neural
> network implementation. I have never taken AI or
> machine learning classes, so some pointers to websites
> or books would be appreciated.

Posted by mensanator, 8/16/2007 5:27:45 PM

On Aug 16, 10:27 am, "mensana...@aol.com" <mensana...@aol.com> wrote:
> On Aug 16, 12:08 pm, Digital Puer <digital_p...@hotmail.com> wrote:
>
> > Hi, I have a problem for which I'm not sure some sort of
> > machine learning would be appropriate.
>
> > Suppose I have a dataset consisting of thousands of tuples
> > and a score for each tuple which was determined through an
> > unknown process. Now, for a new input tuple, I want
> > to get a new predicted score. I have already sifted through
> > the data and have determined there is no clear mapping
> > between the tuple and the score; two tuples with the same
> > data may have differing scores.
>
> So the score isn't based on the tuple. There's no learning
> process that can help you. As far as the tuples are concerned,
> the scores are inconsistent.
>
I should have said that these situations are rather rare.
They are outliers.
Generally speaking, there is some sort of pattern that maps
the tuples to the scores, but I cannot easily see what it is.
It is certainly not a linear relationship between any one or
two of the tuple's components and the scores.
>
> > For example, the tuple could consist of a person's age,
> > address, gender, favourite book. There has already been
> > a score associated for each tuple through some means.
> > Now I want to input myself and predict my score (rather
> > than running it through the scoring process, which is
> > not what I want).
>
> > What could help me here? I was looking at a neural
> > network implementation. I have never taken AI or
> > machine learning classes, so some pointers to websites
> > or books would be appreciated.

Posted by Digital, 8/16/2007 6:04:54 PM

Digital Puer wrote:
> Suppose I have a dataset consisting of thousands of tuples
> and a score for each tuple which was determined through an
> unknown process. Now, for a new input tuple, I want
> to get a new predicted score. I have already sifted through
> the data and have determined there is no clear mapping
> between the tuple and the score; two tuples with the same
> data may have differing scores.
> What could help me here? I was looking at a neural
> network implementation. I have never taken AI or
> machine learning classes, so some pointers to websites
> or books would be appreciated.
This looks like a possible application to radial basis function
networks, assuming that your target function somehow depends smoothly on
the input data.
You can start reading here:
http://en.wikipedia.org/wiki/Radial_basis_function
Christian

Posted by Christian, 8/17/2007 1:32:00 PM

On Aug 17, 2:32 pm, Christian Gollwitzer <Christian.Gollwit...@uni-bayreuth.de> wrote:
> Digital Puer wrote:
> > Suppose I have a dataset consisting of thousands of tuples
> > and a score for each tuple which was determined through an
> > unknown process. Now, for a new input tuple, I want
> > to get a new predicted score. I have already sifted through
> > the data and have determined there is no clear mapping
> > between the tuple and the score; two tuples with the same
> > data may have differing scores.
> > What could help me here? I was looking at a neural
> > network implementation. I have never taken AI or
> > machine learning classes, so some pointers to websites
> > or books would be appreciated.
>
> This looks like a possible application to radial basis function
> networks, assuming that your target function somehow depends smoothly on
> the input data.
> You can start reading here:
>
> http://en.wikipedia.org/wiki/Radial_basis_function
>
A good suggestion.
For a quick (implementation) solution the o.p. might even try nearest
neighbour. Let's say the tuples and scores (values) are doubles.
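    // 1-nearest-neighbour lookup; squared Euclidean distance is one possible metric
    double nn(double[] tup, double[][] trainTup, double[] trainVal) {
        int indexNN = 0;                        // index of the nearest training tuple
        double best = Double.POSITIVE_INFINITY;
        for (int i = 0; i < trainTup.length; i++) {
            double d = 0.0;                     // squared distance from trainTup[i] to tup
            for (int j = 0; j < tup.length; j++) {
                double diff = trainTup[i][j] - tup[j];
                d += diff * diff;
            }
            if (d < best) { best = d; indexNN = i; }
        }
        return trainVal[indexNN];
    }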
'nearest' might use a Euclidean metric, but maybe not -- depends on
the data; and if not, the problem may get more difficult. Radial basis
functions may assume something like a Euclidean metric. The nice thing
about NN and RBF is that whilst they qualify as 'machine learning',
the 'learning' consists mainly of memorising.
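To make the "memorising" point concrete, here is a minimal Python sketch of a Gaussian-weighted average of the stored scores. This is the kernel-regression idea behind GRNN rather than a full RBF network with fitted weights. The function name kernel_predict, the width parameter sigma and the sample data are all assumptions made up for illustration, and it assumes purely numeric tuples with a Euclidean metric.

```python
import math

def kernel_predict(query, train_x, train_y, sigma=1.0):
    """Predict a score as a Gaussian-weighted average of the training scores."""
    weights = []
    for x in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        # Every weight underflowed to zero: fall back to the plain mean score.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total

# Hypothetical training data: four 2-component tuples and their scores.
train_x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
train_y = [1.0, 2.0, 3.0, 10.0]
near_origin = kernel_predict((0.1, 0.1), train_x, train_y, sigma=0.5)
```

A small sigma makes this behave much like nearest neighbour; a large sigma smooths towards the overall mean score, so sigma would have to be tuned on held-out data.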
You might want to do something about the conflicting data; for example
averaging conflicting scores at a specific tuple value; but maybe that
means that the tuples contain integer components?
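One possible way to do that averaging, as a rough Python sketch. The name average_duplicates and the sample data are hypothetical, and it assumes "conflicting" means exactly equal tuples, which is only realistic when the components are discrete (integers, categories).

```python
from collections import defaultdict

def average_duplicates(tuples, scores):
    """Collapse identical tuples into one entry whose score is the group mean."""
    groups = defaultdict(list)
    for t, s in zip(tuples, scores):
        groups[tuple(t)].append(s)
    unique = list(groups)  # keeps first-seen order
    means = [sum(groups[t]) / len(groups[t]) for t in unique]
    return unique, means

# Hypothetical data: the first two tuples are identical but scored differently.
tuples = [(30, 1), (30, 1), (45, 0)]
scores = [4.0, 6.0, 9.0]
unique, means = average_duplicates(tuples, scores)
```

With real-valued components, exact equality will rarely trigger, and some form of rounding or binning would be needed first.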
Also, you might need to 'standardise' the data -- so that each
component of the tuple contributes the same to a distance.
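A sketch of one common form of standardisation, z-scoring each component, in Python. The function name standardise and the sample rows are assumptions for illustration; other scalings (for example to the range 0..1) may suit the data better.

```python
import statistics

def standardise(rows):
    """Rescale each column to zero mean and unit (population) standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.mean(c) for c in columns]
    # A constant column has zero spread; divide by 1.0 instead to avoid a crash.
    stdevs = [statistics.pstdev(c) or 1.0 for c in columns]
    scaled = [tuple((v - m) / s for v, m, s in zip(row, means, stdevs))
              for row in rows]
    return scaled, means, stdevs

# Hypothetical rows: (age, income). Unscaled, income would dominate any distance.
rows = [(20.0, 100000.0), (40.0, 300000.0), (60.0, 200000.0)]
scaled, means, stdevs = standardise(rows)
```

A new input tuple has to be rescaled with the same means and stdevs before its distance to the training tuples is computed.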
The first two books by Masters would be a good starting point, plus
the Bishop book. There's a rather nice new book by Bishop, see
http://research.microsoft.com/%7Ecmbishop/PRML/.
The following also might help
http://research.microsoft.com/~minka/statlearn/glossary/
and the newsgroup comp.ai.neural-nets. I suppose I should set follow-ups
to there, but I'm never certain of the netiquette of that.
Best regards,
Jon C.
@Book{masters-pnnr93,
author = "T. Masters",
title = "Practical Neural Network Recipes in C++",
publisher = "Academic Press",
address = "London",
year = "1993"
} (the original, maybe basic enough to ignore, but maybe there are
tantalising references to it from [masters-aann95])
@Book{masters-aann95,
author = "T. Masters",
title = "Advanced Algorithms for Neural Networks: a C++ sourcebook",
publisher = "John Wiley \& Sons",
address = "New York",
year = "1995"
} (covers PNN and GRNN (roughly RBF))
@Book{masters-sipnn,
author = "T. Masters",
title = "Signal and Image Processing with Neural Networks: C++ sourcebook",
publisher = "John Wiley \& Sons",
address = "New York",
year = "1995"
} (a lot on wavelets, Fourier transforms, texture, shape; note: I have
not used it)
@Book{masters-nnh95,
author = "T. Masters",
title = "Neural, Novel \& Hybrid Algorithms for Time Series Prediction",
publisher = "John Wiley \& Sons",
address = "New York",
year = "1995"
}
@Book{bishop95,
author = "C.M. Bishop",
title = "Neural Networks for Pattern Recognition",
publisher = "Oxford University Press",
address = "Oxford, U.K.",
year = "1995"}

Posted by jg, 8/17/2007 5:17:10 PM

On Aug 16, 1:08 pm, Digital Puer <digital_p...@hotmail.com> wrote:
> Hi, I have a problem for which I'm not sure some sort of machine learning would be appropriate.
>
> Suppose I have a dataset consisting of thousands of tuples
> and a score for each tuple which was determined through an
> unknown process. Now, for a new input tuple, I want
> to get a new predicted score. I have already sifted through
> the data and have determined there is no clear mapping
> between the tuple and the score; two tuples with the same
> data may have differing scores.
>
> For example, the tuple could consist of a person's age,
> address, gender, favourite book. There has already been
> a score associated for each tuple through some means.
> Now I want to input myself and predict my score (rather
> than running it through the scoring process, which is
> not what I want).
>
> What could help me here? I was looking at a neural
> network implementation. I have never taken AI or machine learning classes, so some pointers to websites
> or books would be appreciated.
Any of a number of modeling processes might approximate this mapping,
such as neural networks, logistic regression, k-nearest neighbors,
etc. Which one works best for your particular problem would need to
be determined experimentally. Obviously, the existence of
observations which are inconsistent within the available input
variables means that a perfect approximation of the original mapping
is impossible, but that does not preclude a useful approximation.
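To illustrate what "determined experimentally" could look like, here is a minimal Python sketch: hold back part of the data, then compare a candidate model against a trivial baseline on the held-back part. Everything in it is an assumption for illustration: the synthetic data stands in for the real tuples and scores, 1-nearest-neighbour stands in for whichever model is under test, and mean absolute error is just one possible yardstick.

```python
import random

def nearest_neighbour(query, train_x, train_y):
    """Return the score of the training tuple closest to query (Euclidean)."""
    best = min(range(len(train_x)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)))
    return train_y[best]

def mean_abs_error(predict, test_x, test_y):
    """Average absolute difference between predictions and true scores."""
    return sum(abs(predict(x) - y) for x, y in zip(test_x, test_y)) / len(test_y)

# Synthetic stand-in data: a nonlinear score with some noise added.
rng = random.Random(0)
xs = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(400)]
ys = [x[0] ** 2 + x[1] ** 2 + rng.gauss(0, 0.05) for x in xs]

# Hold back the last 100 observations for testing.
train_x, train_y = xs[:300], ys[:300]
test_x, test_y = xs[300:], ys[300:]

# Baseline: always predict the mean training score.
baseline = sum(train_y) / len(train_y)
mae_baseline = mean_abs_error(lambda x: baseline, test_x, test_y)
mae_nn = mean_abs_error(lambda x: nearest_neighbour(x, train_x, train_y),
                        test_x, test_y)
```

A model that cannot beat the always-predict-the-mean baseline on held-back data is not capturing any useful part of the mapping.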
Empirical modeling has been developed under a number of labels, so you
may find what you're looking for under "data mining", "supervised
learning", "inferential statistics", "machine learning", "pattern
recognition", etc.
More help could be provided if you could describe the problem more specifically.
Some items of interest would be:
-How long are the tuples?
-What data types (numeric, categorical, etc.) comprise the tuples?
-How many historical observations do you have?
-What modeling tools (a statistical package, writing your own in Java,
etc.) do you have at your disposal?
-Will Dwinnell
http://matlabdatamining.blogspot.com/

Posted by Predictor, 8/18/2007 11:21:49 AM

On Aug 16, 1:27 pm, "mensana...@aol.com" <mensana...@aol.com> wrote:
> On Aug 16, 12:08 pm, Digital Puer <digital_p...@hotmail.com> wrote:
>
> > Hi, I have a problem for which I'm not sure some sort of
> > machine learning would be appropriate.
>
> > Suppose I have a dataset consisting of thousands of tuples
> > and a score for each tuple which was determined through an
> > unknown process. Now, for a new input tuple, I want
> > to get a new predicted score. I have already sifted through
> > the data and have determined there is no clear mapping
> > between the tuple and the score; two tuples with the same
> > data may have differing scores.
>
> So the score isn't based on the tuple. There's no learning
> process that can help you. As far as the tuples are concerned,
> the scores are inconsistent.
While it's true that inconsistent data prohibits a perfect
approximation, that is rarely the goal in predictive modeling. In
some situations, only a small improvement over chance ("guessing") is
quite profitable. Being correct only 60% of the time in predicting
red/black in roulette, for example, would be very useful. Naturally,
the larger the proportion of inconsistent observations and the more
dispersed the outcomes are of those inconsistent observations, the
worse the upper limit on possible performance.
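That upper limit can be estimated roughly from the data itself. Any predictor that sees only the tuple must give identical tuples the same output, and for squared error the best such output is the group mean, so the spread of scores within groups of identical tuples gives a floor on the achievable mean squared error. A minimal Python sketch follows; the function name and data are hypothetical, and the floor is measured on the available data only.

```python
from collections import defaultdict

def mse_floor_from_duplicates(tuples, scores):
    """Lowest mean squared error any function of the tuple alone can reach
    on this data, given that identical tuples carry different scores."""
    groups = defaultdict(list)
    for t, s in zip(tuples, scores):
        groups[tuple(t)].append(s)
    sq_err = 0.0
    for group in groups.values():
        mean = sum(group) / len(group)  # best single prediction for the group
        sq_err += sum((s - mean) ** 2 for s in group)
    return sq_err / len(scores)

# Hypothetical data: tuple (1,) appears twice with conflicting scores.
floor = mse_floor_from_duplicates([(1,), (1,), (2,)], [0.0, 2.0, 5.0])
```

If the inconsistent observations are rare, as the original poster later says, this floor will be close to zero and leaves plenty of room for a useful model.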
-Will Dwinnell
http://matlabdatamining.blogspot.com/

Posted by Predictor, 8/18/2007 11:27:33 AM