LPC for Analysis of 2 Speech Signals

Hi All,

I am trying to find the best way to analyze 2 speech signals from 2
different speakers and come up with a 'match percentage' for how
close they are (i.e., did the speakers say the same thing?).

I have been reading that LPC is great for encoding the basic
parameters of speech, but most of the articles are about
compression or building a vocoder.  Does anyone have experience with, or
know of any good resources or C/C++ libraries for, extracting the
parameters of speech for comparison?  Is LPC likely to be the best
algorithm for this type of work?

Thanks
Ray


Reply raeldor (34) 11/28/2010 12:45:48 AM

Try Matlab.
Reply HardySpicer 11/28/2010 3:08:46 AM


Not sure that's an option, as I need to integrate the end results into
a practical application.  I have read about Matlab's DSP toolkit, and
it looks really nice, but unless they provide source code or a C/C++
library for the functions, it will be very hard to port the results
into my application.
Reply Raeldor 11/28/2010 4:33:47 AM

I was working on a filter-bank approach to speech recognition.

The project was on isolated word recognition. A user speaks
a set of words (for example, 1 to 10), repeating each one 8 times
so that the templates can be averaged.

Templates were computed using filter banks, LPFs, and decimation.
The stored 2D templates were then compared with the incoming
word sample to find the match.
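
The averaging and matching steps described above might look roughly like the sketch below. It is not the code offered in this post. It assumes every template has already been reduced to the same fixed size (frames x filter-bank channels), and the function names and the squared-difference distance are hypothetical choices made for illustration.

```python
def average_templates(examples):
    """Element-wise mean of several equal-sized 2D templates (frames x bands)."""
    n = len(examples)
    rows, cols = len(examples[0]), len(examples[0][0])
    return [[sum(ex[r][c] for ex in examples) / n for c in range(cols)]
            for r in range(rows)]


def template_distance(t1, t2):
    """Sum of squared differences between two equal-sized 2D templates."""
    return sum((x - y) ** 2
               for row1, row2 in zip(t1, t2)
               for x, y in zip(row1, row2))


def best_match(templates, sample):
    """Return the label of the stored template closest to the sample.

    `templates` maps a word label to its averaged 2D template.
    """
    return min(templates,
               key=lambda label: template_distance(templates[label], sample))
```

In use, each word's repetitions would go through `average_templates` once during training, and each incoming word would be passed to `best_match` against the stored dictionary.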

I have working C code, which was also ported to the TI C6713 DSP platform.

I am willing to sell the code if anyone is interested in buying it.

Regards
Bharat
Reply bharat 11/28/2010 1:09:37 PM

You can use either dynamic time warping or hidden Markov models. You
may use either the LPC coefficients or the PARCORs derived from them as
the features for either of these two matching methods.
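
For the dynamic time warping option, a minimal sketch follows. It assumes each utterance has already been turned into a list of per-frame feature vectors (LPC-derived or otherwise). The Euclidean frame distance, the unconstrained step pattern, and the normalization by total length are illustrative assumptions rather than the only reasonable choices.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Length-normalized DTW cost between two sequences of feature vectors."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            # Allowed moves: advance in seq_a, advance in seq_b, or both.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m] / (n + m)
```

A lower value means a closer match, and the same frames spoken at a different speed should score near zero. Mapping this cost to a 'match percentage' would need a scale calibrated on real recordings.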

Clay
Reply Clay 11/29/2010 9:40:24 PM


The problem is that there is little relation between perceptual
similarity and the similarity of LPCs or PARCORs. That is, neither LPCs
nor PARCORs are good features for pattern matching; LSPs or cepstral
coefficients could be better in this regard. But these are all minor
technicalities compared to the incredible problem stated by the OP.
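
As an illustration of the cepstral route, LPC coefficients can be converted to cepstral coefficients with a short recursion. The sketch below assumes the sign convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p with the all-pole model 1/A(z) and unit gain; other texts flip the sign of a_k, so check the convention of whatever LPC routine produces the input. The function names are hypothetical.

```python
import math


def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c_1..c_n_ceps of the all-pole model 1/A(z).

    `a` holds [a_1, ..., a_p] for A(z) = 1 + sum_k a_k z^-k.
    Uses c_n = -a_n - (1/n) * sum_{k=1}^{n-1} k * c_k * a_{n-k},
    with a_n taken as zero for n > p.
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        acc = sum(k * c[k - 1] * a[n - k - 1]
                  for k in range(max(1, n - p), n))
        a_n = a[n - 1] if n <= p else 0.0
        c.append(-a_n - acc / n)
    return c


def cepstral_distance(c1, c2):
    """Euclidean distance between two equal-length cepstral vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(c1, c2)))
```

For a single real pole at 0.5 (that is, `a = [-0.5]`), the recursion reproduces the known closed form c_n = 0.5^n / n.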


Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
Reply Vladimir 11/29/2010 11:12:19 PM

Maybe I should explain where I am at the moment.

I am calculating the FFT of a 128-sample frame, windowed with a Gaussian
window, then converting the result to decibels.  I can see the peaks
of the formants visually, but there is a lot of noise on the graph (of
the data), I think probably caused by the harmonics.  Is there a way
to clean up this noise so I can see the formant peaks as smooth peaks in
the graph?  The smaller (128-sample) frame size helped with this, as did
the Gaussian window, but I can't help thinking there is a better
approach.
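
One commonly used way to get a smooth envelope instead of a harmonic-laden FFT plot is to fit an all-pole (LPC) model to the frame and plot the model's frequency response. The sketch below uses the autocorrelation method with the Levinson-Durbin recursion. The function names are hypothetical, the model order is left to the caller (a rule of thumb often quoted is roughly the sample rate in kHz plus 2), and the convention assumed is A(z) = 1 + a_1 z^-1 + ... + a_p z^-p.

```python
import cmath
import math


def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of an (already windowed) frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return (a, refl, err): LPC coefficients a_1..a_p, the reflection
    coefficients (PARCORs, up to sign convention), and residual energy."""
    a = [1.0] + [0.0] * order
    err = r[0]
    refl = []
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        refl.append(k)
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a[1:], refl, err


def lpc_envelope_db(a, gain, n_points):
    """20*log10(gain / |A(e^{jw})|) at n_points frequencies in [0, pi).

    `gain` must be positive, e.g. sqrt(err) from levinson_durbin.
    """
    out = []
    for m in range(n_points):
        w = math.pi * m / n_points
        denom = 1.0 + sum(ak * cmath.exp(-1j * w * (k + 1))
                          for k, ak in enumerate(a))
        out.append(20.0 * math.log10(gain / abs(denom)))
    return out
```

Typical use would be to window the frame, call `autocorrelation(frame, order)`, pass the result to `levinson_durbin`, and then plot `lpc_envelope_db(a, math.sqrt(err), n_points)`. The peaks of that curve approximate the formants without the harmonic ripple.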

Thanks
Ray

Reply Raeldor 11/30/2010 12:24:20 AM


Raeldor,

This is business. You can hire my services; contact me through the web
site. You can also consider the filter-bank TMS C67x software offered by
Bharat Pathak.


Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com



Reply Vladimir 11/30/2010 4:53:08 PM

Google MFCC, PLP, DP, HMM etc. etc. etc

Go through about 1000 references

Better yet, forget about the whole thing: it's called ASR (automatic
speech recognition)

Google it too...

But, just for starters:

http://cslu.cse.ogi.edu/toolkit/
http://cmusphinx.sourceforge.net/
http://www.isip.piconepress.com/projects/speech/software/

This is all crap anyway...
Reply fatalist 12/1/2010 3:52:16 PM

Thank you for these links.  Looks like there's a lot of good info I
haven't seen yet.  I guess having the right terminology helps! :)
Reply Raeldor 12/10/2010 3:25:37 AM
