From: bharat pathak on 31 Mar 2010 22:29

I am trying to isolate the silence regions from the speech regions. To do this, I am trying to find the envelope of the speech signal and compare it with a set threshold value. My question is: what frequencies does a speech envelope typically contain?

What are the typical ways in which people separate speech from the silence regions? Assume some noise is present.

bharat
From: Jerry Avins on 31 Mar 2010 23:24
On 3/31/2010 10:29 PM, bharat pathak wrote:
> I am trying to isolate the silence regions from the speech
> regions. For this I am trying to find the envelope of speech
> signal and compare it with set threshold value. My question is
> what frequencies does a speech envelope typically contain?
>
> What are the typical ways in which people perform the isolation
> of speech signals from the silence regions. Assume in some
> presence of noise.
>
> bharat

What do you mean by the envelope of a baseband signal? That's a new one on me.

Jerry
--
"It does me no injury for my neighbor to say there are 20 gods, or no God. It neither picks my pocket nor breaks my leg."
Thomas Jefferson to the Virginia House of Delegates in 1776.
From: Rafael Deliano on 1 Apr 2010 02:20
> I am trying to isolate the silence regions from the speech
> regions.

Typical applications are:
a) older speech transmission systems (TASI) and speakerphones; in modern speech transmission this is more often called VAD: http://en.wikipedia.org/wiki/Voice_activity_detection
b) isolated-word speech recognition.

> the envelope of speech
> signal and compare it with set threshold value.

Energy-based detection is usable for a) because a human listener will not much mind missing short low-energy, high-frequency parts. For b), two features such as energy plus zero-crossing rate are preferred.

> Assume in some presence of noise.

Noise makes energy less attractive. If you define the noise as low-frequency (Hoth noise), then the zero-crossing rate would still be usable.

Regards,
JRD
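The energy plus zero-crossing-rate idea above can be sketched roughly as follows. This is only a minimal illustration, not a tuned detector: the function names, the non-overlapping 20 ms frames, the thresholds, and the simple "either feature exceeds its threshold" rule are all assumptions made for the example, and the rule only makes sense if the background noise is low-frequency, as in the Hoth-noise case mentioned above.

```python
import math


def tone(freq_hz, amplitude, n_samples, fs_hz):
    """Hypothetical helper: generate a sine tone for the demo below."""
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * n / fs_hz)
            for n in range(n_samples)]


def frame_features(samples, frame_len):
    """Return a list of (energy, zcr) pairs, one per non-overlapping frame.

    energy: mean squared amplitude of the frame.
    zcr:    fraction of adjacent sample pairs whose signs differ.
    """
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:])
                        if (a >= 0) != (b >= 0))
        feats.append((energy, crossings / (frame_len - 1)))
    return feats


def detect_speech(samples, frame_len, energy_thresh, zcr_thresh):
    """Flag a frame as speech if either feature exceeds its threshold.

    Assumption: background noise is low-frequency, so a high
    zero-crossing rate indicates unvoiced speech rather than noise.
    """
    return [energy > energy_thresh or zcr > zcr_thresh
            for energy, zcr in frame_features(samples, frame_len)]


if __name__ == "__main__":
    fs = 8000          # assumed telephone-style sampling rate
    frame = 160        # 20 ms frames at 8 kHz
    signal = (tone(50, 0.01, frame, fs)       # low-frequency background hum
              + tone(200, 0.5, frame, fs)     # loud, voiced-like segment
              + tone(3000, 0.02, frame, fs))  # weak, fricative-like segment
    print(detect_speech(signal, frame, energy_thresh=1e-3, zcr_thresh=0.3))
```

In the synthetic demo, the weak 3 kHz segment falls below the energy threshold and is caught only by its zero-crossing rate, which is the case where energy alone would drop short low-energy, high-frequency parts. In practice the thresholds would have to be estimated from the actual noise floor.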
From: HardySpicer on 1 Apr 2010 11:49
On Apr 1, 3:29 pm, "bharat pathak" <bharat(a)n_o_s_p_a_m.arithos.com> wrote:
> I am trying to isolate the silence regions from the speech
> regions. For this I am trying to find the envelope of speech
> signal and compare it with set threshold value. My question is
> what frequencies does a speech envelope typically contain?
>
> What are the typical ways in which people perform the isolation
> of speech signals from the silence regions. Assume in some
> presence of noise.
>
> bharat

There is a huge literature on this one. Search for "voice activity detectors" on Google Scholar.

Hardy
From: Clay on 1 Apr 2010 14:40
On Mar 31, 10:29 pm, "bharat pathak" <bharat(a)n_o_s_p_a_m.arithos.com> wrote:
> I am trying to isolate the silence regions from the speech
> regions. For this I am trying to find the envelope of speech
> signal and compare it with set threshold value. My question is
> what frequencies does a speech envelope typically contain?
>
> What are the typical ways in which people perform the isolation
> of speech signals from the silence regions. Assume in some
> presence of noise.
>
> bharat

POTS (Plain Old Telephone Service) passes just below 300 to just above 3200 Hz, so expect most of your speech to be in that frequency range.

Clay