From: Amir on
"Jan Simon" <matlab.THIS_YEAR(a)nMINUSsimon.de> wrote in message <i17t0k$6ic$1(a)fred.mathworks.com>...
> Dear Amir,
>
> > The contents can be exactly the same, but the point is to find a word or group of words, given as a short audio file, that appears in a larger audio file.
>
> Again: Then the shorter sound is *exactly* cut out from the larger one? If it is only the "same word", this is a hard task.
> It is time to explain your problem with some details.
>
> Jan

Ok thanks. Initially it would be good to solve the case where the shorter WAV file is cut out of the larger one and may appear in it more than once. I think once this is done, solving the "same word" problem will need some extra comparison algorithms with some kind of percentage score and things like this. What's your opinion?
From: Jan Simon on
Dear Amir,

> Ok thanks. Initially it would be good to solve the case where the shorter WAV file is cut out of the larger one and may appear in it more than once.

Load the WAV files with WAVREAD and let STRFIND search in the resulting double vectors. Surprisingly STRFIND is not limited to strings.
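A minimal sketch of this approach, assuming both files share the same sample rate and bit depth and that the short clip was cut out without re-encoding (otherwise the sample values will not be bit-identical). The file names 'long.wav' and 'short.wav' are placeholders. STRFIND expects row vectors, so the first channel is transposed before the search. In later MATLAB releases AUDIOREAD takes the place of WAVREAD.

```matlab
% Sketch: locate every exact occurrence of a short clip in a longer recording.
% 'long.wav' and 'short.wav' are placeholder file names (assumption).
[big,   fsBig]   = wavread('long.wav');
[small, fsSmall] = wavread('short.wav');

% The sample rates must agree, otherwise the sample values cannot line up.
if fsBig ~= fsSmall
    error('Sample rates differ: %d Hz vs. %d Hz.', fsBig, fsSmall);
end

% WAVREAD returns one column per channel, but STRFIND needs row vectors.
% Only the first channel is compared here.
bigRow   = big(:, 1).';
smallRow = small(:, 1).';

% Start indices (in samples) of every exact match; empty if there is none.
startIdx = strfind(bigRow, smallRow);

% The same positions in seconds from the start of the long file.
startSec = (startIdx - 1) / fsBig;
```

STRFIND returns all matches, so a clip that appears more than once yields several entries in startIdx.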

> I think once this is done, solving the "same word" problem will need some extra comparison algorithms with some kind of percentage score and things like this.

And this will be an enormous task. The signals can differ in an infinite number of properties! They can have noise, different length, different pitch, different modulations, the first half can be faster, while the 2nd half is slower, etc.
Speech recognition is a wide field.

Kind regards, Jan
From: Image Analyst on
"Amir " <newsreader(a)mathworks.com> wrote in message
> Ok thanks. Initially it would be good to solve the case where the shorter WAV file is cut out of the larger one and may appear in it more than once. I think once this is done, solving the "same word" problem will need some extra comparison algorithms with some kind of percentage score and things like this. What's your opinion?
-----------------------------------------------------------------------------------
I think you've way way underestimated the problem. Finding an EXACT match is trivial - just look for a stretch of exactly the same numbers - but finding words that are similar...well you're looking at writing a speech recognition program and this is not some straightforward extension of the simple case. Some companies have been in business years doing this, perfecting this (as much as they can). It's not trivial. Not in my opinion.
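To illustrate the gap between the two cases, here is a rough sketch of a "percentage" score based on normalized cross-correlation. This is a swapped-in illustration, not something proposed earlier in the thread. It tolerates a little noise on a copy of the same recording, but it does not recognize a different utterance of the same word. The signals below are synthetic stand-ins, and all variable names are made up for the example.

```matlab
% Synthetic stand-ins for the audio data (assumption: mono row vectors).
x = randn(1, 5000);                      % "long recording"
t = x(1201:1400) + 0.05*randn(1, 200);   % noisy copy of samples 1201..1400

n  = numel(t);
t0 = t - mean(t);                        % zero-mean template

% Sliding dot product of t0 with every length-n window of x.
c   = filter(fliplr(t0), 1, x);
num = c(n:end);                          % entry k belongs to window x(k:k+n-1)

% Sum of squared deviations of each window, computed from cumulative sums.
s1 = cumsum([0, x]);
s2 = cumsum([0, x.^2]);
winSum  = s1(n+1:end) - s1(1:end-n);
winSum2 = s2(n+1:end) - s2(1:end-n);
winDev  = max(winSum2 - winSum.^2 / n, eps);   % guard against division by zero

% Normalized cross-correlation: 1 means identical up to gain and offset.
score = num ./ sqrt(winDev * sum(t0.^2));

[bestScore, bestIdx] = max(score);
fprintf('Best match at sample %d, score %.1f%%\n', bestIdx, 100*bestScore);
```

The FILTER call costs on the order of numel(x)*numel(t) operations, so for real recordings an FFT-based correlation would be the more practical choice. Changes in speed, pitch or speaker will push the score down quickly, which is why this does not extend to the "same word" case.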
From: Amir on
"Jan Simon" <matlab.THIS_YEAR(a)nMINUSsimon.de> wrote in message <i185vf$124$1(a)fred.mathworks.com>...
> Dear Amir,
>
> > Ok thanks. Initially it would be good to solve the case where the shorter WAV file is cut out of the larger one and may appear in it more than once.
>
> Load the WAV files with WAVREAD and let STRFIND search in the resulting double vectors. Surprisingly STRFIND is not limited to strings.
>
-------
Ok I will try this, thanks.
--------
> > I think once this is done, solving the "same word" problem will need some extra comparison algorithms with some kind of percentage score and things like this.
>
> And this will be an enormous task. The signals can differ in an infinite number of properties! They can have noise, different length, different pitch, different modulations, the first half can be faster, while the 2nd half is slower, etc.
> Speech recognition is a wide field.
>
> Kind regards, Jan
----
Yes, you're right. I thought we could find such algorithms, but I agree with you that this is a speech recognition problem and a very big deal.
From: Amir on
"Image Analyst" <imageanalyst(a)mailinator.com> wrote in message <i1888k$gps$1(a)fred.mathworks.com>...
> "Amir " <newsreader(a)mathworks.com> wrote in message
> > Ok thanks. Initially it would be good to solve the case where the shorter WAV file is cut out of the larger one and may appear in it more than once. I think once this is done, solving the "same word" problem will need some extra comparison algorithms with some kind of percentage score and things like this. What's your opinion?
> -----------------------------------------------------------------------------------
> I think you've way way underestimated the problem. Finding an EXACT match is trivial - just look for a stretch of exactly the same numbers - but finding words that are similar...well you're looking at writing a speech recognition program and this is not some straightforward extension of the simple case. Some companies have been in business years doing this, perfecting this (as much as they can). It's not trivial. Not in my opinion.
-------------------------------------
I know that this is a hard task, but I thought that maybe we could extend the simple matching to the hard case using some algorithms with a percentage score. Can you point me to any links or literature where I can learn more about audio processing and problems like the one above? Are there any audio libraries for MATLAB or C# that could be useful for me?

Thank you for your time and opinion.