what frequencies does a speech envelope contain? [DSP]


From: bharat pathak on 31 Mar 2010 22:29

I am trying to isolate the silence regions from the speech
regions. For this I am trying to find the envelope of the speech
signal and compare it with a set threshold value. My question is:
what frequencies does a speech envelope typically contain?

What are the typical ways in which people isolate
speech from the silence regions? Assume some noise
is present.

bharat
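
One common reading of "envelope" for a baseband speech signal is the rectified, low-pass-smoothed amplitude. The sketch below illustrates that reading only; it is not necessarily what the poster implemented. The function names, the one-pole smoother, and the 20 Hz cutoff are all assumptions chosen for illustration (the cutoff assumes that the amplitude variations of interest are slow compared with the pitch), and the threshold has to be tuned to the recording.

```python
import math

def envelope(samples, sample_rate, cutoff_hz=20.0):
    """Full-wave rectify, then smooth with a one-pole low-pass filter.

    cutoff_hz is an assumed value, not one given in the thread.
    """
    # One-pole smoothing coefficient derived from the cutoff frequency.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    env = []
    state = 0.0
    for x in samples:
        # Move the state a fraction alpha toward the rectified sample.
        state += alpha * (abs(x) - state)
        env.append(state)
    return env

def above_threshold(env, threshold):
    """Per-sample decision: True where the envelope exceeds the threshold."""
    return [e > threshold for e in env]
```

In practice the per-sample decisions are usually smoothed further (for example with a hangover time) so that short dips inside a word are not cut out.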

From: Jerry Avins on 31 Mar 2010 23:24

On 3/31/2010 10:29 PM, bharat pathak wrote:
> I am trying to isolate the silence regions from the speech
> regions. For this I am trying to find the envelope of the speech
> signal and compare it with a set threshold value. My question is:
> what frequencies does a speech envelope typically contain?
>
> What are the typical ways in which people isolate
> speech from the silence regions? Assume some noise
> is present.
>
> bharat

What do you mean by the envelope of a baseband signal? That's a new one
on me.

Jerry
--
"It does me no injury for my neighbor to say there are 20 gods, or no
God. It neither picks my pocket nor breaks my leg."
Thomas Jefferson to the Virginia House of Delegates in 1776.

From: Rafael Deliano on 1 Apr 2010 02:20

> I am trying to isolate the silence regions from the speech
> regions.

Typical applications are:
a) older speech transmission systems (TASI) and
speakerphones; in modern speech transmission this is more
often called VAD (voice activity detection):
http://en.wikipedia.org/wiki/Voice_activity_detection
b) isolated-word speech recognition.

> the envelope of the speech
> signal and compare it with a set threshold value.

Energy-based detection is usable for a) because a human
listener will not mind much if short low-energy, high-frequency
segments (such as weak fricatives) are missed.
For b), two features such as energy plus zero-crossing rate
are preferred.

> Assume some noise is present.
That makes energy alone less attractive. If the noise is
mostly low-frequency (Hoth noise), the zero-crossing rate
remains usable.
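
The two features mentioned above can be computed per frame roughly as follows. This is a minimal sketch, not a published VAD algorithm: the function names, the non-overlapping framing, and the simple "energy high OR zero-crossing rate high" rule are assumptions made for illustration, and both thresholds are hypothetical values that must be tuned to the signal and the noise. The rule also relies on the background noise being mostly low-frequency, so that a high zero-crossing rate points to speech rather than to noise.

```python
def frame_features(samples, frame_len):
    """Return a list of (energy, zcr) pairs for non-overlapping frames."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Short-time energy: mean squared amplitude of the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Zero-crossing rate: fraction of adjacent sample pairs that
        # differ in sign (zero is treated as non-negative).
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcr = crossings / (frame_len - 1)
        feats.append((energy, zcr))
    return feats

def detect_speech(feats, energy_thresh, zcr_thresh):
    """Flag a frame as speech if its energy is high (typical of voiced
    sounds) or its zero-crossing rate is high (typical of weak,
    high-frequency sounds)."""
    return [e > energy_thresh or z > zcr_thresh for e, z in feats]
```

With 8 kHz sampling, a frame length of 160 samples gives 20 ms frames, a common choice for speech. Broadband noise also has a high zero-crossing rate, so in that case the second test would need a different feature or a noise estimate.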

MfG JRD

From: HardySpicer on 1 Apr 2010 11:49

On Apr 1, 3:29 pm, "bharat pathak" <bharat(a)n_o_s_p_a_m.arithos.com>
wrote:
> I am trying to isolate the silence regions from the speech
> regions. For this I am trying to find the envelope of the speech
> signal and compare it with a set threshold value. My question is:
> what frequencies does a speech envelope typically contain?
>
> What are the typical ways in which people isolate
> speech from the silence regions? Assume some noise
> is present.
>
> bharat

There is a huge literature on this one. Search for "voice activity
detector" on Google Scholar.

Hardy

From: Clay on 1 Apr 2010 14:40

On Mar 31, 10:29 pm, "bharat pathak" <bharat(a)n_o_s_p_a_m.arithos.com>
wrote:
> I am trying to isolate the silence regions from the speech
> regions. For this I am trying to find the envelope of the speech
> signal and compare it with a set threshold value. My question is:
> what frequencies does a speech envelope typically contain?
>
> What are the typical ways in which people isolate
> speech from the silence regions? Assume some noise
> is present.
>
> bharat

POTS (Plain Old Telephone Service) passes just below 300 to just above
3200 Hz, so expect most of your speech to be in that frequency range.

Clay
