How can I compare two speech signals using the FFT or DCT?



I want to compare two speech signals using the FFT or DCT. How can I do that? Any ideas?
Reply andandgui 10/3/2010 8:30:24 PM


Does nobody know?
Reply andandgui 10/7/2010 7:51:07 AM


What is the purpose?
What do you mean by "compare"? Comparing the fundamental frequency? Or is this for speech or speaker recognition?

The FFT, STFT, or DCT is typically applied to short frames of the signal, and features are then computed from the resulting coefficients. These features can then be classified.
So for the "comparison" you will probably need a classifier (maximum likelihood, Bayes, ...). For speech recognition, HMMs (hidden Markov models) are often used.
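
As a rough illustration of the framing and feature step, here is a minimal sketch. It is written in Python with only the standard library rather than MATLAB, and everything in it is an assumption made for illustration: the function names, the 64-sample frame length, the 32-sample hop, the Hann window, and the 8 kHz sampling rate. The DFT is computed naively so the example stays self-contained; in practice you would call an FFT routine instead.

```python
import cmath
import math

def frame_signal(x, frame_len, hop):
    """Split x into overlapping frames of frame_len samples, hop samples apart."""
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]

def dft_magnitude(frame):
    """Magnitudes of DFT bins 0..N/2 (naive O(N^2) DFT, for illustration only)."""
    n = len(frame)
    # Hann window to reduce spectral leakage at the frame edges
    window = [0.5 - 0.5 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]
    xw = [s * w for s, w in zip(frame, window)]
    return [abs(sum(xw[t] * cmath.exp(-2j * math.pi * b * t / n) for t in range(n)))
            for b in range(n // 2 + 1)]

def spectral_features(x, frame_len=64, hop=32):
    """One magnitude spectrum per frame: a simple STFT-style feature matrix."""
    return [dft_magnitude(f) for f in frame_signal(x, frame_len, hop)]

# Synthetic 1 kHz tone standing in for a recording (assumed 8 kHz sampling rate)
fs = 8000
tone = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(256)]
features = spectral_features(tone)

# Index of the strongest bin in the first frame
peak_bin = max(range(len(features[0])), key=lambda b: features[0][b])
print(len(features), "frames, strongest bin in frame 0:", peak_bin)
```

Each row of `features` is the feature vector for one frame. A classifier would then work on these rows, or on something derived from them.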
Reply Janis 10/7/2010 9:43:05 AM


I want to compare two speech signals to see how likely it is that they match. For example, I will first record my voice in MATLAB saying "stop". Then, when I say "stop" again in a quiet room, MATLAB should respond with 'ok, "stop" is recognized'. That is the basic idea. I will develop my code starting from this step.
Reply andandgui 10/7/2010 12:41:03 PM

Are all the words as short as "stop"? So you are going to build a single-word speech recognizer?
For longer words it might be better to have a database of phonemes. The literature on speech recognition and hidden Markov models (HMMs) will be helpful.

It is also recommended to use the mel spectrum for speech signals, which means converting the frequency axis to the mel-frequency scale.
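
To make the mel conversion concrete, here is a small sketch in Python (standard library only, not MATLAB). It uses one commonly quoted form of the mel formula, mel = 2595 * log10(1 + f/700). Other variants exist, so treat the constants as an assumption. The helper names and the 0 to 4000 Hz range with 10 bands are made up for illustration.

```python
import math

def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (one common variant of the formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges(f_min, f_max, n_bands):
    """Band edges in Hz that are equally spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]

# Hypothetical setup: 10 bands between 0 Hz and 4000 Hz
edges = mel_band_edges(0.0, 4000.0, 10)
print([round(e) for e in edges])
```

The edges are close together at low frequencies and spread out at high frequencies, which is the point of the mel scale: it gives more resolution where speech carries the most information.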

For short words it may be enough to split the signal into frames (STFT) and use features computed from them (e.g. DCT coefficients) for classification. But you will need training samples to estimate the mean and variance of each feature.
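
A minimal sketch of what "mean and variance of each feature" can look like in practice is below, in Python with only the standard library. It assumes each word is modelled as an independent Gaussian per feature (a diagonal-covariance maximum likelihood classifier), which is just one simple choice. The function names and the two-dimensional training vectors are invented for the example. Real feature vectors would come from the framing and transform step.

```python
import math

def fit_class(samples):
    """Estimate the per-feature mean and variance from training feature vectors."""
    n = len(samples)
    dims = len(samples[0])
    means = [sum(s[d] for s in samples) / n for d in range(dims)]
    # A small floor keeps the variance away from zero for near-constant features
    variances = [max(sum((s[d] - means[d]) ** 2 for s in samples) / n, 1e-6)
                 for d in range(dims)]
    return means, variances

def log_likelihood(x, model):
    """Log-likelihood of x under independent Gaussians, one per feature."""
    means, variances = model
    return sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
               for xi, m, v in zip(x, means, variances))

def classify(x, models):
    """Return the word whose model gives x the highest log-likelihood."""
    return max(models, key=lambda word: log_likelihood(x, models[word]))

# Made-up training data: three feature vectors per word
training = {
    "stop": [[1.0, 5.0], [1.2, 4.8], [0.9, 5.1]],
    "go":   [[4.0, 1.0], [4.2, 0.9], [3.9, 1.2]],
}
models = {word: fit_class(vectors) for word, vectors in training.items()}
print(classify([1.1, 4.9], models))
```

With only a few training samples per word the variance estimates are unreliable, so more recordings per word should help.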

Depending on the length of the words, the classes might be whole words or phonemes. With phonemes, you have to "put the classes together" to form a word ("state2word"; HMM).
Reply Janis 10/7/2010 1:28:03 PM


All the words will be short, like "left", "right", "go", and "stop". I have to use the FFT to analyze them.
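
For reference, one very simple way to compare two short signals in the frequency domain is sketched below, in Python with only the standard library rather than MATLAB. It averages the magnitude spectra over all frames of each signal and measures the cosine similarity between the two averages. This is an illustrative baseline and not a method proposed in this thread. The function names, frame sizes, and synthetic tones are assumptions, and the naive DFT stands in for an FFT call.

```python
import cmath
import math

def magnitude_spectrum(x):
    """Magnitudes of DFT bins 0..N/2 (naive DFT; an FFT gives the same values)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * b * t / n) for t in range(n)))
            for b in range(n // 2 + 1)]

def average_spectrum(x, frame_len=64, hop=32):
    """Average the magnitude spectra of all frames of x."""
    frames = [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]
    spectra = [magnitude_spectrum(f) for f in frames]
    return [sum(s[b] for s in spectra) / len(spectra)
            for b in range(frame_len // 2 + 1)]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means identical shape."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm if norm > 0 else 0.0

def spectral_similarity(x, y):
    """Similarity of two signals based on their frame-averaged spectra."""
    return cosine_similarity(average_spectrum(x), average_spectrum(y))

def tone(freq_hz, n=256, amp=1.0, fs=8000):
    """Synthetic sine wave standing in for a recorded word."""
    return [amp * math.sin(2 * math.pi * freq_hz * t / fs) for t in range(n)]

template = tone(500)            # stored reference
quieter = tone(500, amp=0.5)    # same content, lower volume
different = tone(2000)          # different content

print(spectral_similarity(template, quieter) > spectral_similarity(template, different))
```

Cosine similarity ignores overall loudness, which helps when the second recording is quieter. Averaging over frames throws away the time order of the sounds, though, so this baseline cannot tell apart words made of the same sounds in a different order. That is one reason sequence models such as HMMs are used for real recognizers.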
Reply andandgui 10/7/2010 2:03:03 PM


In general, this is not an easy problem, and one that not even human beings 
have mastered.  For instance, while this is a tongue-twister that's hard to 
pronounce, it can also be hard to understand if you just hear it rather than 
being able to read it:

The sixth sick sheik's sixth sheep's sick.

And then of course there's something like:

I want to send two tutus to Tucson.

-- 
Steve Lord
slord@mathworks.com
comp.soft-sys.matlab (CSSM) FAQ: http://matlabwiki.mathworks.com/MATLAB_FAQ
To contact Technical Support use the Contact Us link on 
http://www.mathworks.com 

Reply Steven_Lord 10/7/2010 4:09:53 PM

You may find this demo useful:

http://www.mathworks.com/company/newsletters/digest/2010/jan/word-recognition-system-matlab.html?s_cid=MLD0110naTA2&s_v1=6977184_1-7ZR2N6

It uses the DAQ Toolbox to acquire the data, but you can ignore that part and look at the analysis method.

Wayne
Reply Wayne 10/7/2010 6:12:04 PM
