Need advice on getaudiodata for wav file frequency / pitch [Matlab]

From: st on 31 Jul 2010 12:07

Thank you, Mr. Walter,

Actually, I am working on a project to extract all the frequencies in a wav file and display the pitch as a function of time...

Is there a more efficient approach than getting the sampled data one sample at a time?

The flow of my idea is:

1) use wavin on the .wav file to get the data of the audio file
2) sample the wav file at 44.1 kHz (1046.50 Hz is considered a HIGH pitch for the human voice, so a sampling rate of 44.1 kHz does satisfy the Nyquist theorem)
3) retrieve the sampled data (I have no idea where the data is stored or how to retrieve the samples ONE BY ONE)
4) use IF-ELSE statements to compare the sampled data and return the pitch / musical notation of the sampled data.

Please advise.

From: Walter Roberson on 31 Jul 2010 21:57

st wrote:

> Actually, I am working on a project to extract all the frequencies in a
> wav file and display the pitch as a function of time...
> Is there a more efficient approach than getting the sampled data one
> sample at a time?

If it fits into memory, then reading the entire file at once would be
the most efficient. Reading from disk is one of the slowest of
operations, probably more than 100 times slower than a calculation, so
you want to do as few reads as you can and read as much at one time as
you can afford to (in terms of memory or in terms of response time if
you were doing "real time" work.)

> the flow of my idea is
> 1) use wavin on the .wav file to get the data of the audio file

wavin() appears to be part of a commercial product for about $CDN 300.
It is not _necessary_ as you can use wavread(), but in some cases the
extra efficiency might be well worth it.
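
As a rough sketch of reading the whole file in one call: the example below first writes a short test tone to a temporary file so that it is self-contained (the tone, the 8000 Hz rate, and the temporary file name are made up for illustration). It uses audioread() and audiowrite(), which exist in newer MATLAB releases; in the releases current for this thread, wavread() returns the same [y, fs] pair. The samples simply come back as an ordinary array, so individual values are reached by indexing.

```matlab
% Write a short test tone so the example does not depend on an existing file.
% (Tone, sample rate, and file name are illustrative assumptions.)
fs = 8000;                          % sample rate in Hz
t = (0:fs-1)' / fs;                 % one second of time stamps
tone = 0.5 * sin(2*pi*440*t);       % 440 Hz sine, column vector
fname = [tempname '.wav'];          % temporary file name
audiowrite(fname, tone, fs);

% One call reads every sample into memory.
% In older releases: [y, fsRead] = wavread(fname);
[y, fsRead] = audioread(fname);

nSamples = size(y, 1);              % rows are samples, columns are channels
firstSample = y(1);                 % a single sample is just an array element
delete(fname);                      % clean up the temporary file
```

After this, y(k) is the k-th sample and fsRead is the rate the file was recorded at, so no separate "retrieve one by one" step is needed.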

> 2) sample the wav file at 44.1 kHz (1046.50 Hz is considered a HIGH
> pitch for the human voice, so a sampling rate of 44.1 kHz does satisfy
> the Nyquist theorem)

Do you mean resample? Resampling cannot add new information (only
potentially lose information), and the process of identifying particular
frequencies would be the same whether you are using 44.1 kHz or whatever
rate the .wav was recorded at. It is not clear to me that this would
gain you anything.
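
To illustrate the point that the sample rate does not change which frequency is found, here is a small sketch under stated assumptions: a synthetic 440 Hz sine (not data from any real file) is generated at two rates, 8000 Hz and 44100 Hz, and the largest FFT bin is converted back to Hz. Both rates are well above twice the tone frequency, and the one-second length is chosen so each FFT bin is exactly 1 Hz wide.

```matlab
f0 = 440;                           % test tone frequency in Hz (assumed)
rates = [8000 44100];               % two sample rates to compare
peakHz = zeros(size(rates));

for k = 1:numel(rates)
    fs = rates(k);
    t = (0:fs-1)' / fs;             % exactly one second of samples
    x = sin(2*pi*f0*t);             % the same tone at this sample rate
    X = abs(fft(x));                % magnitude spectrum
    half = X(1:floor(fs/2)+1);      % keep bins from 0 Hz up to fs/2
    [~, idx] = max(half);           % index of the strongest bin
    peakHz(k) = (idx-1) * fs / numel(x);   % convert bin index to Hz
end

disp(peakHz);                       % same peak frequency for both rates
```

The peak lands on the same frequency at either rate; the higher rate only extends the range of frequencies that could be represented.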

> 3) retrieve the sampled data (I have no idea where the data is stored
> or how to retrieve the samples ONE BY ONE)

That will depend how you do the resampling. Most techniques tend to
return the resampled values as an array.

> 4) use IF-ELSE statements to compare the sampled data and return the
> pitch / musical notation of the sampled data.

There's a big jump between step 3 and 4. Resampling at a particular
frequency is simply going to get you a set of data samples intended for
that playback frequency, and that does not inherently give you any
information about the pitches. Finding pitches by comparison to the
sampled data is probably not going to work very well, especially when
there is a mix of pitches (as there almost always is for human voice.)

You need to do more research on pitch extraction. Some of us could give
you off-hand clues, but you would be much better off using something
like Google Scholar to look for algorithms.
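
As one example of the kind of off-hand clue meant here, autocorrelation is a commonly described starting point for pitch estimation: a periodic signal correlates strongly with itself when shifted by one period. The sketch below is only an illustration, not a recommended or complete algorithm. The frame is synthetic (a 220 Hz tone plus one harmonic), and the 8000 Hz rate, the 1024-sample frame, and the 80 to 1000 Hz search range are all assumptions chosen for the example.

```matlab
fs = 8000;                          % sample rate in Hz (assumed)
n = (0:1023)';                      % sample indices of one analysis frame
f0 = 220;                           % true pitch of the synthetic frame
frame = sin(2*pi*f0*n/fs) + 0.5*sin(2*pi*2*f0*n/fs);   % tone + 2nd harmonic

% Search only lags that correspond to plausible pitches.
minLag = floor(fs/1000);            % shortest period: 1000 Hz
maxLag = ceil(fs/80);               % longest period: 80 Hz
lags = minLag:maxLag;

% Autocorrelation of the frame at each candidate lag.
r = zeros(size(lags));
for k = 1:numel(lags)
    L = lags(k);
    r(k) = sum(frame(1:end-L) .* frame(1+L:end));
end

[~, best] = max(r);                 % lag with the strongest self-similarity
pitchHz = fs / lags(best);          % convert the period in samples to Hz
```

Because the lag is a whole number of samples, the estimate is only close to 220 Hz, not exact. Real singing also needs framing over time, handling of silence, and protection against octave errors, which is where the published algorithms come in.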

From: st on 3 Aug 2010 05:48

Thank you, Mr. Walter, for your advice on my problem. Regarding the points below:

1) At my current stage, I wish to extract the pitch from a recorded audio file (ONE PERSON SINGING). Only once this stage is done will I pursue "real time" work. If I define a function that returns the pitch value, is it efficient to process the audio data by calling the function ONE-BY-ONE? This is the brief idea of how I determine the pitch of the data (audio frequency), e.g., C4-Do, D4-Re and so on.

2) Regarding wavread, I have tried to work with it, but I ran into a problem. I have 2 .wav files (a.wav and b.wav).
(i) a.wav is an audio file I downloaded from the internet, and the wavread function works fine with it.
(ii) b.wav is an audio file that I converted from b.wma. I recorded b.wma with Windows recorder. But the wavread function doesn't work with b.wma.
I have noticed that the file attributes for a.wav are A but for b.wav are AN.
- May I know the root cause of this problem?
- Does the difference in attributes cause this problem?

3) I was thinking of resampling the audio file because I had forgotten that the audio file itself already stores the audio frequency as data. I thought I had to perform sampling again in order to get the data (frequency) of the audio. Thanks for correcting my misconception. So it is clear to me that my task for now is how to access the data (frequency) of the audio. I will work on that first.

4) Thanks for your suggestion of Google Scholar. I will look for an algorithm that extracts the pitch of the audio rather than the playback frequency. Though comparing sampled data with pitches might not work well, I still wish to try it out.
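
For the note-naming idea in point 1 (C4, D4 and so on), a chain of IF-ELSE tests is not strictly needed once a frequency estimate is available. A minimal sketch, assuming equal temperament with A4 = 440 Hz and the MIDI numbering convention in which note 60 is C4: round the frequency to the nearest semitone and look the name up in a table. The input frequencies are example values, not results from any recording.

```matlab
names = {'C','C#','D','D#','E','F','F#','G','G#','A','A#','B'};
freqHz = [261.63 293.66 440.00 1046.50];      % example pitch estimates in Hz

% Nearest MIDI note number, assuming A4 = 440 Hz is note 69.
midi = round(69 + 12*log2(freqHz/440));

for k = 1:numel(freqHz)
    noteName = names{mod(midi(k), 12) + 1};   % position within the octave
    octaveNum = floor(midi(k)/12) - 1;        % octave number (note 60 = C4)
    fprintf('%8.2f Hz -> %s%d\n', freqHz(k), noteName, octaveNum);
end
```

With these inputs the loop prints C4, D4, A4 and C6. Calling something like this once per analysis frame, rather than once per sample, keeps the number of calls small.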

Your advice gave me a clue for starting my project! I appreciate your reply.
