Speech Signal Separation into Voiced/Unvoiced Parts [Matlab]

From: Slava on 1 Dec 2009 14:23

Hello, all.
As part of my degree project, I need to use Matlab to separate a recorded speech file into voiced and unvoiced parts.
For now I can record a file in Matlab and calculate the average zero-crossing rate, the autocorrelation and the short-time energy.
But afterwards I need to "show" the recorded signal split into its "voiced" and "unvoiced" parts.
I understand that I need to somehow compare the zero-crossing and energy vectors...

Please assist.

Thank you in advance.
---------
Slava

From: yogesh angal on 17 Dec 2009 15:28

Dear Slava,
See Ronald W. Schafer and Lawrence R. Rabiner, "Digital Representations of Speech Signals," Proceedings of the IEEE, vol. 63, no. 4, April 1975:
1) The major significance of E(n) is that it provides a good measure for separating voiced speech segments from unvoiced speech segments. E(n) for unvoiced segments is much smaller than for voiced segments.
2) It is well known that the energy of voiced speech tends to be concentrated below 3 kHz, whereas the energy of fricatives generally is concentrated above 3 kHz. Thus, zero crossing measurements (along with energy information) are often used in making a decision about whether a particular segment of speech is voiced or unvoiced. If the zero crossing rate is high, the implication is unvoiced; if the zero crossing rate is low, the segment is most likely to be voiced.
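
The two measures described above can be sketched in Matlab as follows. This is only a minimal illustration, not code from the paper: the synthetic test signal (a 150 Hz tone standing in for voiced speech, a weak 3500 Hz tone standing in for a fricative), the 400-sample frame length and all variable names are assumptions chosen for the example.

```matlab
% Assumed test signal: 0.2 s "voiced-like" tone, then 0.2 s "unvoiced-like" tone.
fs = 8000;                                  % sampling rate in Hz
t  = (0:0.2*fs - 1)' / fs;                  % 1600 samples per part
x  = [sin(2*pi*150*t); 0.05*sin(2*pi*3500*t)];

frameLen  = 400;                            % assumed frame length in samples
numFrames = floor(length(x) / frameLen);

% One non-overlapping frame per column.
frames = reshape(x(1:numFrames*frameLen), frameLen, numFrames);

% Short-time energy: mean of the squared samples in each frame.
E = sum(frames.^2, 1) / frameLen;

% Zero-crossing rate: fraction of adjacent sample pairs whose sign differs.
s = sign(frames);
s(s == 0) = 1;                              % count exact zeros as positive
zcr = sum(abs(diff(s, 1, 1)), 1) / (2 * (frameLen - 1));

% Per-frame table: frame number, energy, zero-crossing rate.
disp([(1:numFrames)' E' zcr']);
```

On this signal the first four frames show high energy with a low zero-crossing rate, and the last four show the opposite, which is the pattern the two quoted points describe.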

Yogesh S Angal

From: Slava on 18 Dec 2009 03:04

Thank you for your answer, Yogesh.
But how can I perform this separation in Matlab?
I know how to record a signal and how to compute its average energy and average zero-crossing rate.
Afterwards I have 2 vectors:
1 = average energy of the recorded signal.
2 = average zero-crossing rate of the recorded signal.
What do I need to do in order to separate (and plot, of course) my recorded voice file into "voiced" and "unvoiced" parts?
Do I need to somehow compare these vectors? Divide them?

Thank you.

From: yogesh angal on 18 Dec 2009 14:48

Dear Slava,
In the frame-by-frame processing stage, the speech signal is divided into non-overlapping frames of samples, and the frames are processed one at a time until the entire signal is covered. Prepare a table that holds the energy and the zero-crossing rate of each frame, then take the voiced/unvoiced decision for each frame from those two values. For example, suppose the signal has 3600 samples at an 8000 Hz sampling rate. At the beginning, set the frame size to 400 samples, which gives 9 frames. At the end of the algorithm, if the decision for a frame is not clear, split that frame into two halves and recalculate the energy and zero-crossing rate for each half.
This will give the correct information about the voiced and unvoiced parts of the signal.
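
A rough Matlab sketch of this procedure, using the 3600-sample, 8000 Hz, 400-sample-frame example. The test signal, the two thresholds (eThr, zThr), the rule that a frame is "not clear" when energy and zero-crossing rate disagree, and all variable names are assumptions made for illustration; real recordings will need thresholds tuned to the data.

```matlab
fs = 8000;  frameLen = 400;
n  = (0:1799)';
% Assumed test signal: 1800 "voiced-like" samples, then 1800 "unvoiced-like".
x  = [sin(2*pi*150*n/fs); 0.05*sin(2*pi*3500*n/fs)];    % 3600 samples

energyOf = @(f) mean(f.^2);                              % short-time energy
zcrOf    = @(f) sum(abs(diff(sign(f) + (f == 0)))) / (2*(numel(f) - 1));

% Table of energy and zero-crossing rate, one row per frame.
numFrames = floor(numel(x) / frameLen);
E = zeros(numFrames, 1);  Z = zeros(numFrames, 1);
for k = 1:numFrames
    f = x((k-1)*frameLen + (1:frameLen));
    E(k) = energyOf(f);  Z(k) = zcrOf(f);
end

eThr = 0.1 * max(E);      % assumed energy threshold
zThr = 0.25;              % assumed zero-crossing-rate threshold

% Per-sample label: 1 = voiced, 0 = unvoiced, -1 = still undecided.
voiced = zeros(size(x));
for k = 1:numFrames
    idx  = (k-1)*frameLen + (1:frameLen);
    segs = {idx};
    clearVoiced   = E(k) >  eThr && Z(k) <  zThr;
    clearUnvoiced = E(k) <= eThr && Z(k) >= zThr;
    if ~(clearVoiced || clearUnvoiced)
        % Decision not clear: split the frame into two halves.
        segs = {idx(1:frameLen/2), idx(frameLen/2+1:end)};
    end
    for s = 1:numel(segs)
        e = energyOf(x(segs{s}));  z = zcrOf(x(segs{s}));
        if e > eThr && z < zThr
            voiced(segs{s}) = 1;
        elseif e <= eThr && z >= zThr
            voiced(segs{s}) = 0;
        else
            voiced(segs{s}) = -1;
        end
    end
end
```

In this example frame 5 straddles the boundary between the two parts, so it is split and each half is labelled separately. To see the result, plot x and overlay the label vector, for example plot(x); hold on; plot(voiced, 'r'); or extract the parts directly with x(voiced == 1) and x(voiced == 0).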

Ask me if you have any problems.
Yogesh
