
### How can I compare two speech signals using FFT or DCT?


```I want to compare two speech signals using the FFT or DCT. How can I do that? Any ideas?
```

```"andandgui isler" <bosisler_ist@hotmail.com> wrote in message <i8ap50$jkk$1@fred.mathworks.com>...
> I want to compare two speech signals using the FFT or DCT. How can I do that? Any ideas?

Does nobody know?
```

```What is the purpose?
What do you mean by "compare"? The fundamental frequency? Or is this speech/speaker recognition?

The FFT/STFT/DCT is applied to short frames of the signal, and features are then computed from the resulting coefficients. These features can then be classified.
So for the "comparison" you might need a classifier (maximum likelihood, Bayes, ...); for speech recognition, HMMs are often used.
```
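The frame-then-transform step described above can be sketched as follows. This is a minimal illustration in Python rather than MATLAB, using only the standard library; the function names, the frame length of 64, the hop of 32, and the synthetic test tone are all assumptions made for the example. The naive O(N^2) DFT is there only to keep the sketch self-contained; in practice an FFT routine (MATLAB's `fft`, or `numpy.fft.rfft` in Python) would be used.

```python
import cmath
import math

def frame_signal(x, frame_len, hop):
    """Split x into overlapping frames of frame_len samples, hop samples apart."""
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]

def magnitude_spectrum(frame):
    """Apply a Hamming window and return |DFT| for bins 0..N/2 (naive DFT)."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    return [abs(sum(w * cmath.exp(-2j * math.pi * k * i / n)
                    for i, w in enumerate(windowed)))
            for k in range(n // 2 + 1)]

def stft_features(x, frame_len=64, hop=32):
    """One magnitude spectrum (feature vector) per frame."""
    return [magnitude_spectrum(f) for f in frame_signal(x, frame_len, hop)]

# Synthetic stand-in for a recording: a 1 kHz sine sampled at 8 kHz.
fs = 8000
tone = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(256)]

features = stft_features(tone)
# Index of the strongest bin in the first frame, converted to Hz below.
peak_bin = max(range(len(features[0])), key=lambda k: features[0][k])
print(len(features), "frames,", len(features[0]), "bins per frame")
print("strongest bin in first frame:", peak_bin * fs / 64, "Hz")
```

Each row of `features` is one feature vector; a classifier would work on these rows, or on quantities derived from them, rather than on the raw samples.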

```"Janis Doebler" <janis.doebler@tu-berlin.de> wrote in message <i8k4n9$i8$1@fred.mathworks.com>...
> What is the purpose?
> What do you mean by "compare"? The fundamental frequency? Or is this speech/speaker recognition?
>
> The FFT/STFT/DCT is applied to short frames of the signal, and features are then computed from the resulting coefficients. These features can then be classified.
> So for the "comparison" you might need a classifier (maximum likelihood, Bayes, ...); for speech recognition, HMMs are often used.

I want to compare two speech signals to measure how alike they are. For example, I will first record my voice in MATLAB saying "stop". Then, when I say "stop" again in a quiet room, MATLAB should respond with 'ok, "stop" is recognized'. That is the basic idea. I will develop my code starting from this step.
```
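The "record once, compare later" idea in the post above amounts to template matching. A rough sketch in Python (standard library only) follows; it assumes each utterance has already been reduced to a single feature vector, for example a spectrum averaged over frames. The vectors, the function names, and the 0.9 threshold are made up for illustration. Averaging over frames discards timing, so this is only a starting point, not a robust recognizer.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recognize(template, candidate, threshold=0.9):
    """Accept the candidate if it is close enough to the stored template."""
    return cosine_similarity(template, candidate) >= threshold

# Hypothetical feature vectors; real ones would come from the recordings.
stop_template = [0.1, 0.8, 0.4, 0.1]     # stored when "stop" was first recorded
same_word = [0.12, 0.75, 0.45, 0.08]     # a later utterance of "stop"
other_word = [0.7, 0.1, 0.1, 0.6]        # a different word

accepted = recognize(stop_template, same_word)
rejected = recognize(stop_template, other_word)
print("same word accepted:", accepted)
print("other word accepted:", rejected)
```

The threshold has to be tuned on real recordings: too low and other words are accepted, too high and normal variation in the same word is rejected.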

```Are all the words as short as "stop"? So you are going to build a single-word speech recognizer?
For longer words it might be better to have a database of phonemes. Speech recognition techniques and hidden Markov models (HMMs) are helpful there.

It is also recommended to use a mel spectrum for speech signals, which means converting the frequency axis to the mel scale.

For short words it may be enough to split the signal into frames (STFT) and use features (e.g. DCT coefficients) to classify them. But you will need training samples to estimate the mean and variance of each feature.

Depending on the length of the words, the classes might be words or phonemes. For phonemes you have to "put classes together" to form a word ("state2word"; HMM).
```
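The point about needing training samples to estimate the mean and variance of each feature can be sketched as a small maximum-likelihood classifier. The Python sketch below (standard library only) assumes one independent Gaussian per feature and per word; the two-dimensional feature vectors and all names are invented for the example and are not taken from real speech data.

```python
import math

def fit_class(samples):
    """Estimate the per-feature mean and variance from training vectors."""
    n = len(samples)
    dims = len(samples[0])
    means = [sum(s[d] for s in samples) / n for d in range(dims)]
    # A small constant keeps the variance away from zero.
    variances = [sum((s[d] - means[d]) ** 2 for s in samples) / n + 1e-6
                 for d in range(dims)]
    return means, variances

def log_likelihood(x, model):
    """Log-likelihood of x under independent Gaussians, one per feature."""
    means, variances = model
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, means, variances))

def classify(x, models):
    """Return the word whose model gives x the highest likelihood."""
    return max(models, key=lambda word: log_likelihood(x, models[word]))

# Hypothetical training features: a few utterances per word.
training = {
    "stop": [[1.0, 5.0], [1.2, 4.8], [0.9, 5.1]],
    "go": [[4.0, 1.0], [4.2, 1.1], [3.9, 0.8]],
}
models = {word: fit_class(samples) for word, samples in training.items()}

label = classify([1.1, 4.9], models)
print("classified as:", label)
```

With only a few training utterances per word the variance estimates are unreliable, which is why more samples per word are needed in practice.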

```"Janis Doebler" <janis.doebler@tu-berlin.de> wrote in message <i8kht2$iii$1@fred.mathworks.com>...
> Are all the words as short as "stop"? So you are going to build a single-word speech recognizer?
> For longer words it might be better to have a database of phonemes. Speech recognition techniques and hidden Markov models (HMMs) are helpful there.
>
> It is also recommended to use a mel spectrum for speech signals, which means converting the frequency axis to the mel scale.
>
> For short words it may be enough to split the signal into frames (STFT) and use features (e.g. DCT coefficients) to classify them. But you will need training samples to estimate the mean and variance of each feature.
>
> Depending on the length of the words, the classes might be words or phonemes. For phonemes you have to "put classes together" to form a word ("state2word"; HMM).

All the words will be short, like "left", "right", "go", and "stop". I have to use the FFT to analyze them.
```

```
"andandgui isler" <bosisler_ist@hotmail.com> wrote in message
news:i8kjun$a43$1@fred.mathworks.com...
> "Janis Doebler" <janis.doebler@tu-berlin.de> wrote in message
> <i8kht2$iii$1@fred.mathworks.com>...
>> Are all the words as short as "stop"?  So you are going to build a
>> single-word speech recognizer? For longer words it might be better to
>> have a database of phonemes. Speech recognition techniques and hidden
>> Markov models (HMMs) are helpful there.
>>
>> It is also recommended to use a mel spectrum for speech signals,
>> which means converting the frequency axis to the mel scale.
>>
>> For short words it may be enough to split the signal into frames (STFT)
>> and use features (e.g. DCT coefficients) to classify them. But you will
>> need training samples to estimate the mean and variance of each feature.
>>
>> Depending on the length of the words, the classes might be words or
>> phonemes. For phonemes you have to "put classes together" to form a
>> word ("state2word"; HMM).
>
>
> All the words will be short, like "left", "right", "go", and "stop". I
> have to use the FFT to analyze them.

In general, this is not an easy problem, and one that not even human beings
have mastered.  For instance, while this is a tongue-twister that's hard to
pronounce, it can also be hard to understand if you just hear it rather than
read it:

The sixth sick sheik's sixth sheep's sick.

And then of course there's something like:

I want to send two tutus to Tucson.

--
Steve Lord
slord@mathworks.com
comp.soft-sys.matlab (CSSM) FAQ: http://matlabwiki.mathworks.com/MATLAB_FAQ
http://www.mathworks.com

```

```"andandgui isler" <bosisler_ist@hotmail.com> wrote in message <i8kf4v$dn0$1@fred.mathworks.com>...
> "Janis Doebler" <janis.doebler@tu-berlin.de> wrote in message <i8k4n9$i8$1@fred.mathworks.com>...
> > What is the purpose?
> > What do you mean by "compare"? The fundamental frequency? Or is this speech/speaker recognition?
> >
> > The FFT/STFT/DCT is applied to short frames of the signal, and features are then computed from the resulting coefficients. These features can then be classified.
> > So for the "comparison" you might need a classifier (maximum likelihood, Bayes, ...); for speech recognition, HMMs are often used.
>
>
> I want to compare two speech signals to measure how alike they are. For example, I will first record my voice in MATLAB saying "stop". Then, when I say "stop" again in a quiet room, MATLAB should respond with 'ok, "stop" is recognized'. That is the basic idea. I will develop my code starting from this step.

You may find this demo useful:

It uses the DAQ Toolbox to acquire the data, but you can ignore that part and look at the analysis method.

Wayne
```


7/25/2012 4:17:23 AM