From: Beatrice P?schel on
hi

I was using the following loop to calculate coherence

for i=1:16
x=filter_Michigan_a(refe_ch,:);
y=filter_Michigan_a(i,:);
window=500;
noverlap=0;
nfft=length(x);
[Cxy,f]=mscohere(x,y,window,noverlap,nfft,fq);
cohere_band1(i)=max(Cxy(3:5));
cohere_band2(i)=max(Cxy(6:8));
cohere_band3(i)=max(Cxy(9:16));
cohere_band4(i)=max(Cxy(17:41));
end

after testing i was not sure if i get correct values for f. i was advised to redefine nfft

nfft=fq (istead of nfft=length(x))

in both cases the rage of f is [0, fq/2] as predicted by the mscohere function but i get a longer vector for nfft=length(x)
i dont know how to decide which version is correct or if both definitions for nfft are wrong.
any suggestion/ help would be greatly appreciated.
thank you.
Beatrice
From: Bas on
Hi,

For the accurate calculation of coherence, you need a lot of averages.
You thus want to obtain as much data as possible. If the length of
your data is limited, this means that you have to use pretty short
windows, which means that your frequency resolution will be pretty
low. Setting nfft=length(x) is wrong, in this way you only use a
single window, which gives no valid results. Nfft = fq seems also
strange, since the first is a number of samples, while the second is a
frequency. This would mean that you take window lengths of 1 second,
but I am not sure if that is what you want. Further, what you usually
do is use a Hann or Hamming window, which means that the data at both
extremes of the window are weighted really low. To overcome this
problem and use all the information in your data, it is common to use
a Hanning window and overlap the windows by 50%. Finally, it is common
to set nfft to a powers of 2 for fast computations, but this is
probably not absolutely necessary.

The way I always use mscohere:
nfft = 2^n
[coh, f] = mscohere(x, y, hanning(nfft), nfft/2, nfft, fsample);
but since nfft defaults to the length of the window and noverlap to
half of that, you can also write
[coh, f] = mscohere(x, y, hanning(nfft), [], [], fsample);
or if you dont care a lot about the difference between hanning and
hamming, simply
[coh, f] = mscohere(x, y, nfft, [], [], fsample);
But do read the help of mscohere to check.

The length of the window is then twindow = nfft / fsample
which gives you a frequency resolution of fres = 1/twindow = fsample /
nfft
so you will get a frequency vector of f = 0:fres:(fsample/2), which
thus has a length(f) = nfft/2 + 1
Sinc you overlap by 50%, the number of averages is roughly naverage =
2 * tdata / twindow, where tdata is the length of your data in
seconds. Based on the value of the coherence you expect, you should
choose your number of averages, wich can be very high if you want to
find a low coherence between the x and y signals. It is all a trade-
off between frequency resolution, accuracy of the result (better with
more averages) and the required amount of data.

Finally, I guess from your code that you want to compute the coherence
in certain frequency bands. Note that you hard-coded the indices of
these bands, but obviously these indices will change if you play with
nfft.

HTH,
Bas
From: Beatrice P?schel on
Hi Bas

thanks so much for your help. the mscohere funktion is much clearer for me know. you write nfft = 2^n. i was wondering how n is defined or how you define nfft if you dont want to set nfft to a powers of 2?

thank you,
Beatrice
From: Bas on
On Nov 13, 6:05 pm, "Beatrice P?schel" <beatrice.poesc...(a)gmx.de>
wrote:
> thanks so much for your help. the mscohere funktion is much clearer for me know. you write nfft = 2^n. i was wondering how n is defined or how you define nfft if you dont want to set nfft to a powers of 2?

Simple, based on the formulas I gave before you should be able to
figure out how big you want nfft to be approximately. If you care care
about speed, you then do
nfft = 2^round(log2(nfft_target))
to get a power of 2 that is close. If speed is not an issue, just give
nfft any value you want, but try to avoid values that contain large
prime factors. See help fft for the reason why, but it is basically a
detail of the fft algorithm.

Some more information about choosing the number of averages: I know
from experience that if you have a coherence of some 10%, you need
around 100 averages to be certain that you are not looking at
statistical fluctuations of two uncorrelated signals. I guess that if
you expect a coherence with a value C, you need about 1/C^2 averages,
but I do not know if that is theoretically correct so the relation
might even be exponential. Also note that results are biased towards
positive values if you do not average enough. Even for two signals
that are totally uncorrelated, you will always compute a positive
value (by definition), which converges pretty slowly to zero as you
increase the number of averages. But what is your application? Do you
just want to check if some signal is dominated by some other noisy
channel at some frequencies (this is a something we do daily, you just
take some tens of averages), or do you want to know accurately that
some noise couples with about 1% to another signal (which we need to
know sometimes, in which case we use half a day worth of data sampled
at 10kHz)? Play with your data to see what works or use simulated data
if you want to be sure.

Bas