Hi All,
I am trying to find the best way to analyze two speech signals from two
different speakers and come up with a 'match percentage' indicating how
close they are (i.e., did the speakers say the same thing?).
I have been reading that LPC is great for encoding the basic
parameters of speech, but most of the articles are about
compression or building a vocoder. Does anyone have experience with, or
know of any good resources or C/C++ libraries for, extracting the
parameters of speech for comparison? Is LPC likely the best algorithm
for this type of work?
Thanks
Ray
raeldor (34) | 11/28/2010 12:45:48 AM

Try Matlab.
HardySpicer | 11/28/2010 3:08:46 AM

On Nov 27, 7:08 pm, HardySpicer <gyansor...@gmail.com> wrote:
> Try Matlab.
Not sure that's an option, as I need to integrate the end result into
a practical application. I have read about Matlab's DSP toolkit, and
it looks really nice, but unless they provide source code or a C/C++
library for the functions, it will be very hard to translate it into my
application.
Raeldor | 11/28/2010 4:33:47 AM

I was working on a filterbank approach to speech recognition.
The project was on isolated word recognition: a user speaks
a set of words (for example, 1 to 10) and repeats them 8 times
so that the templates can be averaged.
Templates were computed using filter banks, LPFs and decimation.
The stored 2D templates were then compared with the incoming
word sample to find the match.
I have working C code, which was also ported to the TI C6713 DSP platform.
I am willing to sell the code if anyone is interested in buying it.
Regards
Bharat
bharat | 11/28/2010 1:09:37 PM

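A minimal sketch of the template averaging and matching steps described in the post above. It is not Bharat's code, which is not shown in the thread. It assumes each repetition of a word has already been reduced to a 2D template of identical shape (frames x filter-bank channels); the function names and the squared-error distance are illustrative assumptions.

```python
def average_templates(repetitions):
    """Element-wise mean of equally sized 2D templates (frames x channels).

    Assumption: every repetition has the same number of frames and channels.
    """
    count = len(repetitions)
    frames = len(repetitions[0])
    channels = len(repetitions[0][0])
    return [
        [sum(rep[f][c] for rep in repetitions) / count for c in range(channels)]
        for f in range(frames)
    ]


def template_distance(a, b):
    """Sum of squared differences between two templates of the same shape."""
    return sum(
        (x - y) ** 2
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


def best_match(templates, sample):
    """Return the label of the stored template closest to the incoming sample."""
    return min(templates, key=lambda label: template_distance(templates[label], sample))
```

In practice the repetitions rarely have the same length, so some form of time normalization would be needed before averaging; that step is left out here.
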
You can use either dynamic time warping (DTW) or hidden Markov models
(HMMs). With either matching method, you can use the LPC coefficients
directly as features or convert them to PARCORs (reflection
coefficients) first.
Clay
Clay | 11/29/2010 9:40:24 PM

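A minimal sketch of the dynamic time warping idea mentioned in the post above. It assumes each utterance has already been turned into a sequence of per-frame feature vectors (LPC coefficients, PARCORs or anything else); the function names, the Euclidean frame distance and the unconstrained step pattern are illustrative choices, not something specified in the thread.

```python
import math


def frame_distance(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def dtw_distance(seq_a, seq_b):
    """Accumulated cost of the best time alignment between two sequences.

    cost[i][j] holds the cheapest way to align the first i frames of seq_a
    with the first j frames of seq_b.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            # Allowed moves: repeat a frame of either sequence, or advance both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

A lower accumulated cost means a closer match. Turning it into a 'match percentage' would need some normalization (for example by path length) and a calibration against known matching and non-matching pairs, which this sketch does not attempt.
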
Clay wrote:
> You can use either dynamic time warping or hidden markov models. You
> may use either the LPC coefs or convert them to PARCORs in either of
> these two cost mechanisms.
Problem is, there is not much relation between perceptual similarity
and the similarity of LPCs or PARCORs. I.e., neither LPCs nor PARCORs are
good features for pattern matching; LSPs or cepstrum coefficients could
be better in this regard. But these are all minor technicalities compared to
the incredibly hard problem stated by the OP.
Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
Vladimir | 11/29/2010 11:12:19 PM

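A minimal sketch of the LPC-to-cepstrum route suggested in the post above: LPC coefficients can be converted to cepstrum coefficients with a short recursion, and two frames can then be compared by the Euclidean distance between their cepstra. The code assumes the all-pole model H(z) = 1 / (1 - sum_k a_k z^-k) with unit gain; with the opposite sign convention for a_k the signs must be flipped. The function names are illustrative.

```python
import math


def lpc_to_cepstrum(a, n_ceps):
    """Convert predictor coefficients a_1..a_p to cepstrum c_1..c_n_ceps.

    a[k - 1] holds a_k. The gain term c_0 is not computed.
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] is left unused
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        # c_n = a_n + sum_{k} (k / n) * c_k * a_{n-k}, for 1 <= n - k <= p
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


def cepstral_distance(c1, c2):
    """Euclidean distance between two cepstrum vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(c1, c2)))
```

A per-frame distance like this could serve as the local cost inside a DTW or similar alignment; whether it tracks perceived similarity well enough for a given application still has to be checked on real recordings.
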
On Nov 29, 3:12 pm, Vladimir Vassilevsky <nos...@nowhere.com> wrote:
> Problem is, there is not much relation between the perceptual similarity
> and the similarity of LPCs or ParCors. I.e. neither LPCs nor parcors are
> good functions for pattern matching; LSPs or cepstrum coefficients could
> be better in this regard. But, it's all minor technicalities compared to
> the incredible problem stated by OP.
Maybe I should explain where I am at the moment.
I am calculating the FFT of a 128-sample frame windowed with a Gaussian
window, then converting the result to decibels. I can see the formant
peaks visually, but there is a lot of ripple on the graph (of
the data), which I think is probably caused by the pitch harmonics. Is
there a way to clean up this noise so the formant peaks show as smooth
peaks in the graph? The smaller (128-sample) frame size helped with
this, as did the Gaussian window, but I can't help thinking there is a
better approach.
Thanks
Ray
Raeldor | 11/30/2010 12:24:20 AM

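A minimal sketch related to the question above. Instead of smoothing the FFT magnitude, one common approach is to fit a low-order LPC model to the windowed frame and plot the model's frequency response, which follows the formant peaks without the pitch-harmonic ripple. The code assumes the autocorrelation method with the Levinson-Durbin recursion and the convention that x[n] is predicted as sum_k a_k x[n-k]; the function names and the Gaussian sigma are illustrative choices, not from the thread.

```python
import cmath
import math


def gaussian_window(n, sigma=0.4):
    """Gaussian window of length n; sigma is relative to half the length."""
    half = (n - 1) / 2.0
    return [math.exp(-0.5 * ((i - half) / (sigma * half)) ** 2) for i in range(n)]


def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of one (already windowed) frame."""
    return [
        sum(x[i] * x[i + lag] for i in range(len(x) - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns (a_1..a_p, reflection coefficients, residual energy).
    Assumes r[0] > 0, i.e. the frame is not silent.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    parcor = []
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        parcor.append(k)
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], parcor, err


def lpc_envelope_db(a, gain, n_points):
    """Smooth spectral envelope in dB from 0 up to (not including) Nyquist.

    gain is the residual energy from levinson_durbin and must be positive.
    """
    env = []
    for m in range(n_points):
        w = math.pi * m / n_points
        denom = 1.0 - sum(a[k] * cmath.exp(-1j * w * (k + 1)) for k in range(len(a)))
        env.append(20.0 * math.log10(math.sqrt(gain) / abs(denom)))
    return env
```

A typical use would be to multiply the frame by `gaussian_window(len(frame))`, call `autocorrelation(windowed, order)`, then `levinson_durbin(r, order)` and `lpc_envelope_db(a, err, 64)`. A commonly quoted rule of thumb is an order of roughly the sampling rate in kHz plus 2, but the right value depends on the recordings. The reflection coefficients returned here correspond to the PARCORs mentioned earlier in the thread, up to sign convention.
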
Raeldor wrote:
> Maybe I should explain where I am at the moment.
> I am calculating the FFT of a 128-sample windowed using gaussian
> distribution, then converting this to decibels. I can see the peaks
> of the formants visually, but there is a lot of noise on the graph (of
> the data), I think probably caused by the harmonics. Is there a way
> to clean this noise so I can see the formant peaks as smooth peaks in
> the graph? The smaller (128-bit) sample size helped with this, as did
> the gaussian window, but I can't help but think there is a better
> approach for this?
Raeldor,
This is business. You can hire my services; contact me via the web site. You
can also consider the filterbank TMS C67x software offered by Bharat Pathak.
Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
Vladimir | 11/30/2010 4:53:08 PM

Google MFCC, PLP, DP, HMM etc. etc. etc
Go through about 1000 references
Better yet, forget about the whole thing: it's called ASR (automatic
speech recognition)
Google it too...
But, just for starters:
http://cslu.cse.ogi.edu/toolkit/
http://cmusphinx.sourceforge.net/
http://www.isip.piconepress.com/projects/speech/software/
This is all crap anyway...
fatalist | 12/1/2010 3:52:16 PM

On Nov 30, 11:53 am, Vladimir Vassilevsky <nos...@nowhere.com> wrote:
> Raeldor,
>
> This is business. You can hire my services; contact at the web site. You
> can also consider filterbank TMS C67x software offered by Bharat Pathak.
>
> Vladimir Vassilevsky
> DSP and Mixed Signal Design Consultant
> http://www.abvolt.com
ajmcgraw | 12/2/2010 3:33:20 AM

On Dec 1, 7:52 am, fatalist <simfid...@gmail.com> wrote:
> Google MFCC, PLP, DP, HMM etc. etc. etc
>
> Go through about 1000 references
>
> Better yet, forget about the whole thing: it's called ASR (automatic
> speech recognition)
>
> Google it too...
>
> But, just for starters:
>
> http://cslu.cse.ogi.edu/toolkit/
> http://cmusphinx.sourceforge.net/
> http://www.isip.piconepress.com/projects/speech/software/
>
> This is all crap anyway...
Thank you for these links. Looks like there's a lot of good info I
haven't seen yet. I guess having the right terminology helps! :)
Raeldor | 12/10/2010 3:25:37 AM