I want to compare two speech signals using the FFT or DCT. How can I do that? Any ideas?
Posted by andandgui, 10/3/2010 8:30:24 PM

"andandgui isler" <bosisler_ist@hotmail.com> wrote in message <i8ap50$jkk$1@fred.mathworks.com>...
> I want to compare two speech signals using the FFT or DCT. How can I do that? Any ideas?
Does nobody know?
Posted by andandgui, 10/7/2010 7:51:07 AM

What is the purpose?
What do you mean by "compare"? The fundamental frequency? Or is it speech or speaker recognition?
The FFT/STFT/DCT is applied to short frames of the signal, and features are then computed from the resulting coefficients. These features can then be classified.
So for the "comparison" you might need a classifier (maximum likelihood, Bayes, ...); for speech recognition, HMMs are often used.
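
As a rough illustration of the frame-then-feature idea, here is a minimal sketch in Python rather than MATLAB, using only the standard library. Everything in it is an assumption made for illustration: the function names, the frame size of 64 samples, the hop of 32, the 8 kHz rate, and the synthetic sine tones that stand in for recorded speech. It uses a naive DFT, which gives the same magnitudes an FFT would, only more slowly.

```python
import cmath
import math

def frames(signal, size, hop):
    """Split a list of samples into overlapping frames of `size` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def hamming(n):
    """Hamming window of length n, used to reduce spectral leakage."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]

def magnitude_spectrum(frame):
    """Naive O(N^2) DFT magnitudes for bins 0..N/2."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]

def features(signal, size=64, hop=32):
    """One feature vector per frame: the windowed magnitude spectrum."""
    window = hamming(size)
    return [magnitude_spectrum([s * w for s, w in zip(frame, window)])
            for frame in frames(signal, size, hop)]

def mean_vector(vectors):
    """Average the per-frame vectors into one vector per signal."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

RATE = 8000  # assumed sampling rate in Hz

def tone(freq, n=512):
    """Synthetic stand-in for a recording: a pure sine tone."""
    return [math.sin(2 * math.pi * freq * t / RATE) for t in range(n)]

ref = mean_vector(features(tone(500)))         # "template" signal
similar = mean_vector(features(tone(510)))     # nearly the same signal
different = mean_vector(features(tone(1500)))  # clearly different signal

print(distance(ref, similar) < distance(ref, different))
```

Averaging over frames discards all timing information, so this only shows the feature extraction step; for real words a classifier or a time-alignment method would still be needed on top.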
Posted by Janis, 10/7/2010 9:43:05 AM

"Janis Doebler" <janis.doebler@tu-berlin.de> wrote in message <i8k4n9$i8$1@fred.mathworks.com>...
> What is the purpose?
> What do you mean by "compare"? The fundamental frequency? Or is it speech or speaker recognition?
>
> The FFT/STFT/DCT is applied to short frames of the signal, and features are then computed from the resulting coefficients. These features can then be classified.
> So for the "comparison" you might need a classifier (maximum likelihood, Bayes, ...); for speech recognition, HMMs are often used.
I want to compare two speech signals to measure how similar they are. For example, I will first record my voice in MATLAB while saying "stop". Then, when I say "stop" again in a silent room, MATLAB should respond with 'ok, "stop" is recognized'. This is the basic outline of my idea. I will develop my code starting from this step.
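
One possible way to sketch this record-once, match-later idea is template matching with dynamic time warping (DTW), which tolerates the same word being spoken faster or slower. DTW is not mentioned in this thread; it is swapped in here purely as an illustration. The sketch is in Python rather than MATLAB, the names `dtw_distance` and `recognize` are hypothetical, the one-dimensional feature sequences are made up, and the threshold of 0.1 is an arbitrary value that would have to be tuned on real recordings.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(seq_a, seq_b):
    """Dynamic time warping cost between two sequences of feature vectors,
    normalized by the combined sequence length."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = cheapest alignment of the first i and first j vectors
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch seq_b
                                 cost[i][j - 1],      # stretch seq_a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m] / (n + m)

def recognize(template, utterance, threshold):
    """True if the utterance is close enough to the stored template."""
    return dtw_distance(template, utterance) <= threshold

# Made-up feature sequences (one value per frame), not real speech data.
template = [[0.0], [1.0], [2.0], [1.0], [0.0]]
slow_repeat = [[0.0], [0.0], [1.0], [1.0], [2.0], [2.0], [1.0], [1.0], [0.0], [0.0]]
other_word = [[2.0], [2.0], [0.0], [0.0], [2.0]]

print(recognize(template, slow_repeat, 0.1))
print(recognize(template, other_word, 0.1))
```

In practice the sequences would be per-frame spectral features of the recordings rather than single numbers.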
Posted by andandgui, 10/7/2010 12:41:03 PM

Are all the words as short as "stop"? So you are going to build a single-word speech recognizer?
For longer words it might be better to have a database of phonemes. Speech recognition techniques such as hidden Markov models (HMMs) are helpful there.
It is also recommended to use the mel spectrum when working with speech signals, which means converting the frequency axis to the mel-frequency scale.
For short words it may be enough to split the signal into frames (STFT) and use features (e.g. DCT coefficients) to classify them. But you will need training samples to estimate the mean and variance of each feature.
Depending on the length of the words, the classes might be words or phonemes. For phonemes you have to "put classes together" to form a word ("state2word"; HMM).
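
The mel conversion and the mean-and-variance classifier described above can be sketched roughly as follows, in Python rather than MATLAB. This is an assumption-laden illustration: it uses one common mel-scale formula, models each feature as an independent Gaussian per word, and trains on made-up two-dimensional feature vectors instead of real DCT coefficients. The names `train`, `log_likelihood` and `classify` are hypothetical.

```python
import math

def hz_to_mel(f_hz):
    """One common mel-scale formula: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def train(samples):
    """Estimate the mean and variance of each feature from training vectors."""
    n = len(samples)
    columns = list(zip(*samples))
    means = [sum(col) / n for col in columns]
    # A small floor keeps the variance from collapsing to zero.
    variances = [max(sum((x - m) ** 2 for x in col) / n, 1e-6)
                 for col, m in zip(columns, means)]
    return means, variances

def log_likelihood(x, model):
    """Log-likelihood of x under independent Gaussians, one per feature."""
    means, variances = model
    return sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
               for xi, m, v in zip(x, means, variances))

def classify(x, models):
    """Pick the word whose model gives x the highest likelihood."""
    return max(models, key=lambda word: log_likelihood(x, models[word]))

# Made-up training vectors; real ones would come from framed recordings.
training = {
    "stop": [[1.0, 5.0], [1.2, 4.8], [0.9, 5.1]],
    "go": [[4.0, 1.0], [4.2, 1.1], [3.9, 0.8]],
}
models = {word: train(vectors) for word, vectors in training.items()}

print(classify([1.1, 4.9], models))
print(round(hz_to_mel(1000.0)))
```

Several recordings of each word are needed so that the variance estimates mean something; with a single recording per word only a distance to the template is available.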
Posted by Janis, 10/7/2010 1:28:03 PM

"Janis Doebler" <janis.doebler@tu-berlin.de> wrote in message <i8kht2$iii$1@fred.mathworks.com>...
> Are all the words as short as "stop"? So you are going to build a single-word speech recognizer?
> For longer words it might be better to have a database of phonemes. Speech recognition techniques such as hidden Markov models (HMMs) are helpful there.
>
> It is also recommended to use the mel spectrum when working with speech signals, which means converting the frequency axis to the mel-frequency scale.
>
> For short words it may be enough to split the signal into frames (STFT) and use features (e.g. DCT coefficients) to classify them. But you will need training samples to estimate the mean and variance of each feature.
>
> Depending on the length of the words, the classes might be words or phonemes. For phonemes you have to "put classes together" to form a word ("state2word"; HMM).
All the words will be short, like "left", "right", "go" and "stop". I have to use the FFT to analyze them.
Posted by andandgui, 10/7/2010 2:03:03 PM

"andandgui isler" <bosisler_ist@hotmail.com> wrote in message
news:i8kjun$a43$1@fred.mathworks.com...
> "Janis Doebler" <janis.doebler@tu-berlin.de> wrote in message
> <i8kht2$iii$1@fred.mathworks.com>...
>> Are all the words as short as "stop"? So you are going to build a
>> single-word speech recognizer? For longer words it might be better to
>> have a database of phonemes. Speech recognition techniques such as
>> hidden Markov models (HMMs) are helpful there.
>>
>> It is also recommended to use the mel spectrum when working with speech
>> signals, which means converting the frequency axis to the mel-frequency
>> scale.
>>
>> For short words it may be enough to split the signal into frames (STFT)
>> and use features (e.g. DCT coefficients) to classify them. But you will
>> need training samples to estimate the mean and variance of each feature.
>> Depending on the length of the words, the classes might be words or
>> phonemes. For phonemes you have to "put classes together" to form a word
>> ("state2word"; HMM).
>
>
> All the words will be short, like "left", "right", "go" and "stop". I have
> to use the FFT to analyze them.
In general, this is not an easy problem, and one that not even human beings
have mastered. For instance, while this is a tongue-twister that's hard to
pronounce, it can also be hard to understand if you just hear it rather than
being able to read it:
The sixth sick sheik's sixth sheep's sick.
And then of course there's something like:
I want to send two tutus to Tucson.
--
Steve Lord
slord@mathworks.com
comp.soft-sys.matlab (CSSM) FAQ: http://matlabwiki.mathworks.com/MATLAB_FAQ
To contact Technical Support use the Contact Us link on
http://www.mathworks.com
Posted by Steven_Lord, 10/7/2010 4:09:53 PM

"andandgui isler" <bosisler_ist@hotmail.com> wrote in message <i8kf4v$dn0$1@fred.mathworks.com>...
> "Janis Doebler" <janis.doebler@tu-berlin.de> wrote in message <i8k4n9$i8$1@fred.mathworks.com>...
> > What is the purpose?
> > What do you mean by "compare"? The fundamental frequency? Or is it speech or speaker recognition?
> >
> > The FFT/STFT/DCT is applied to short frames of the signal, and features are then computed from the resulting coefficients. These features can then be classified.
> > So for the "comparison" you might need a classifier (maximum likelihood, Bayes, ...); for speech recognition, HMMs are often used.
>
>
> I want to compare two speech signals to measure how similar they are. For example, I will first record my voice in MATLAB while saying "stop". Then, when I say "stop" again in a silent room, MATLAB should respond with 'ok, "stop" is recognized'. This is the basic outline of my idea. I will develop my code starting from this step.
You may find this demo useful:
http://www.mathworks.com/company/newsletters/digest/2010/jan/word-recognition-system-matlab.html?s_cid=MLD0110naTA2&s_v1=6977184_1-7ZR2N6
It uses the DAQ Toolbox to acquire the data, but you can ignore that part and look at the analysis method.
Wayne
Posted by Wayne, 10/7/2010 6:12:04 PM