From: Greg Heath on
On Aug 4, 4:04 am, "Dean " <jiangwei0...(a)gmail.com> wrote:
> Greg Heath <he...(a)alumni.brown.edu> wrote in message <2ba31fe8-f154-4dec-ad32-64cbae041...(a)w12g2000yqj.googlegroups.com>...
> > On Aug 2, 9:07 pm, "Dean " <jiangwei0...(a)gmail.com> wrote:
> > > Hi guys,
>
> > > I understand that aPSDof a white Gaussian noise of is equivalent to its variance. However, I can't even close to get this value using Periodogram routine. I believe I've forget a scale factor somewhere.
>
> > > For example, I've a WGN of variance of 16, I run a very large number of data sequence through Periodogram, and do a linear fit to the spectrum. Although thePSDlooks flat but the value is wrong.
>
> > > According to Parseval's Theorem, the area under the spectrum must equal the variance and therefore equivalent toPSD. I found using numerical integration of the area gives me a better estimation.
>
> > > On the other hand, I expect the variance is also equivalent to the square of the power spectrum. In which I failed to get a good estimation.
>
> > > Anyone knows how to correctly estimate the variance or has any comments on this please let me know.
>
> > > Cheers,
>
> > > Dean
> > > --------------------Code Here------------------------------------------------
> > > N = 100000;
> > > H = spectrum.periodogram;
> > > x = 4*randn(N,1); % wgn with a variance of 16
> > > Py =psd(H,x);
> > > % in dBPSDis flat and equal to 10*log10(16)
> > > plot(Py.Frequencies,10*log10(Py.Data));
> > > % then do a linear fit to check if the
> > > % PSD_hat=10*log10(16), sadly not as I expected.
> > > trapz(Py.Data)/N % this seem works
> > > sumsqr(Py.Data)/N % this one doesn't
>
> > close all, clear all, clc
> > N = 100000; % N = 100 might be better
> > randn('state',0)
> > x = 4*randn(N,1); % wgn with a variance of 16
> > meanx = mean(x) % 0.012829
> > varx = var(x) % 16.118
> > meansqrx = mean(x.^2) % 16.118
>
> > % NOTE: meanx is to small and N is too large to
> > % distinguish between the two
>
> > check = meansqrx-varx % 3.4184e-006
>
> > X = fft(x);
> >PSD = abs(X).^2/N; % Not scaled by (1/Fs)
> > meanPSD = mean(PSD) % 16.118
> > PSDdb = 10*log10(PSD);
> > meanPSDdb = mean(PSDdb) % 9.5829
> > illegal = 10*log10(meanPSD) % 12.073
>
> > Hope this helps.
>
> > Greg
>
> Hi Greg,
>
> My main purpose is to find a relationship between
variance and PeriodogramPSDestimations. The reason
I use very large value N is >because I believe the
> value will converge to its true value.

I made the comment because, frequently, it is easier to
diagnose/debug when N is small; especially if you want
to rummage through printout obtained by removing the
end of command semicolon.

This is obviously convenient when your answer is
invariant to N.

However, it is also conveneient when you think that
the answer is invariant to N when it really isn't;
or vice versa!

In a secenario where the N = 1e5, regardless of
whether you think the answer is invariant to N or not,
it may be still worthwhile to consider looking at
N0 = 10, 2*N0 and 10*N0 for the purpose of diagnostics.

> For example, If I use 100, varx=12.0688 where the
> theoretical value should be 16.

Again, the theoretical value is irrelevant.

In fact, for your problem, even your choice of function,
randn, is irrelevant.

You are trying to verify a conjectured general,
proportional, relationship between variance and
mean(PSD).

The answer is:

There is none!

Running my script for N = 100 yields

N = 100
meanx = 0.19172
varx = 12.069
meanxsq = 11.985
meanPSD = 11.985
meanPSDdb = 7.6326
illegal = 10.786

Therefore, in general,

mean(PSD) = meanxsq ~= varx


>
> I started with WGN because I know its variance and mean
> of PSD as N->infinity.

Obviously, not a good choice.

> I don't know any other "Gold standard", I thought it
> might be wise to start with something I know. (highly
> appreciate if anyone as any input/comments on this topic).

Consider your ultimate goal. In this case you are trying
to verify a general relationship that is function independent.

Therefore, choose a simple function and a low number of
samples. This allows you to scan through short output
listings as well as analyze detailed plots.

> For a WGN with variance = 1, its meanPSD and variance
> are both = 1. This seems doesn't agree with your comments.
> I am sorry for my understanding, and I am grateful for
> your answer.

Now that you have read through my comments I hope
you have a better understanding of

1. Plancheral's Theorem (meanPSD = meanxsq ~= varx)
2. Using small examples for diagnosis/debugging.

Hope this helps.

Greg

P.S. In a lot of examples errors are obscured when
meanx = 0 and/or varx = 1. So, it may be wise to
consider using other values.
From: Greg Heath on
On Aug 4, 7:26 am, "Dean " <jiangwei0...(a)gmail.com> wrote:
> > > Dean
>
> > Hi Dean, keep in mind that the unmodified periodogram
only converges in expected value (the mean) to truePSD as N
goes to infinity. The periodogram is no a consistent estimator
of the true PSD, which means that its variance will not go to
zero. The variance of the periodogram is approximately the
> square of the true PSD.
>
> > Wayne
>
> Hi Wayne,
>
> That's exactly I've been told by textbooks. I'm trying
> my best to verify it trough simulations but didn't succeed.
> Can you suggest a way to verify it?

If I understand Wayne's comments correctly:

1. This IS NOT a question of averaging over a single large
sample of length M where
a. PSD(1) = M*mean(x)^2
b. mean(PSD) = mean(x.^2)
c. mean(PSD(2:end)) = var(x)

2. This IS a question of averaging over a large number (N)
of samples of length M where M is not necessarily large.

3. Ultimately, a convincing verification can be obtained
using a random process with a nonGaussian distribution
and a nonconstant PSD0.

4. Choose a deterministic PSD = PSD0; e.g., Lorentzian or
Gaussian with size(PSD0) = [1 M]

5. Choose a probability distribution function
with, perhaps, mean ~= zero, var ~= one

6. Fill x with N (large) random samples of length M (small)
so that [M N] = size(x)

7. Transpose the fft of x and obtain the corresponding PSD.

8. Compute the summary statistics of PSD by averaging over
the N rows

9. Tabulate, as a function of N, the comparison of the PSD
statistics with the theoretical values obtained from
PSD0

10. Transpose x and compute the summary statistics of x
by averaging over the N rows

11. Tabulate, as a function of N, the comparison of x and
PSD statistics

13. In the demo below, I have
a. ignored the requirement in step 3.
b. been unable to explain why the the variance of the
DC and Nyquist components are twice the theoretical
values in step 9
b. been unable to successfully make comparison of PSD
and x summary statics in step 11

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% SINGLE SAMPLE STATISTICS
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

close all, clear all, clc, p=0
M = 100
randn('state',0)
x = 4*randn(M,1); % wgn with a variance of 16
meanx = mean(x) % 0.19172
varx = var(x) % 12.069
meanxsq = mean(x.^2) % 11.985

check1 = varx - (M/(M-1))*(meanxsq - meanx^2 ) % 5.3291e-015

% NOTE: meanxsq = varx only if meanx = 0 and M is very large

X = fft(x);
PSD = abs(X).^2/M; % Not scaled by (1/Fs)

check2 = PSD(1) - M*(meanx)^2 % 3.9968e-015
check3 = mean(PSD) - meanxsq % -7.1054e-015
check4 = mean(PSD(2:end)) - varx % -1.4211e-014

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% ASYMPTOTIC MULTIPLE SAMPLE STATISTICS
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

M = 20
randn('state',0)
PSD0 = 16*ones(1,M);

for n = 1:4

N = 10^n

x = 4*randn(M,N); % wgn with a variance of 16
X = fft(x)'; % Note the transpose
PSD = abs(X).^2/M; % Not scaled by (1/Fs)
meanPSD = mean(PSD);
varPSD = var(PSD);

p=p+1,figure(p),hold on
subplot(211), hold on
plot(PSD0,'k')
plot(meanPSD,'o')
axis([0 M+1 0 40])
legend('PSD0','meanPSD')
title(['Asymptotic PSD Statistics for WGN ( N = ',...
num2str(N),' )'])
subplot(212), hold on
plot(PSD0.^2,'k')
plot(varPSD,'o')
plot(meanPSD.^2,'r*')
axis([0 M+1 0 700])
legend('PSD0^2','varPSD','meanPSD^2')
end

x = x'; % Note the transpose
meanx = mean(x);
meanxsq = meanx.^2;
meansqx = mean(x.^2);
varx = var(x);

check1 = minmax(varx - (N/(N-1))*(meansqx - meanxsq ))
% [-9.4147e-014 9.9476e-014]

p=p+1,figure(p),hold on

plot(meanxsq,'o')
plot(meansqx,'g*')
plot(varx,'ro')
axis([0 M+1 -10 30])
legend('mean(x)^2','mean(x^2)','var(x)')
title(['Asymptotic x Statistics for WGN ( N = ',...
num2str(N),' )'])

for j=p:-1:1
figure(j)
end
return

Still need to

1. successfully compare PSD and x statistics.
2. use a nonGaussian distribution with a nonconstant PSD0

Hope this helps.

Greg

From: Greg Heath on
On Aug 4, 7:00 am, "Wayne King" <wmkin...(a)gmail.com> wrote:
> "Dean " <jiangwei0...(a)gmail.com> wrote in message <i3b6tk$1a...(a)fred.mathworks.com>...
> >GregHeath<he...(a)alumni.brown.edu> wrote in message <2ba31fe8-f154-4dec-ad32-64cbae041...(a)w12g2000yqj.googlegroups.com>...
> > > On Aug 2, 9:07 pm, "Dean " <jiangwei0...(a)gmail.com> wrote:
> > > > Hi guys,
>
> > > > I understand that a PSD of a white Gaussian noise of is equivalent to its variance. However, I can't even close to get this value using Periodogram routine. I believe I've forget a scale factor somewhere.
>
> > > > For example, I've a WGN of variance of 16, I run a very large number of data sequence through Periodogram, and do a linear fit to the spectrum.. Although the PSD looks flat but the value is wrong.
>
> > > > According to Parseval's Theorem, the area under the spectrum must equal the variance and therefore equivalent to PSD. I found using numerical integration of the area gives me a better estimation.
>
> > > > On the other hand, I expect the variance is also equivalent to the square of the power spectrum. In which I failed to get a good estimation.
>
> > > > Anyone knows how to correctly estimate the variance or has any comments on this please let me know.
>
> > > > Cheers,
>
> > > > Dean
> > > > --------------------Code Here------------------------------------------------
> > > > N = 100000;
> > > > H = spectrum.periodogram;
> > > > x = 4*randn(N,1); % wgn with a variance of 16
> > > > Py = psd(H,x);
> > > > % in dB PSD is flat and equal to 10*log10(16)
> > > > plot(Py.Frequencies,10*log10(Py.Data));
> > > > % then do a linear fit to check if the PSD_hat=10*log10(16), sadly not as I expected.
> > > > trapz(Py.Data)/N % this seem works
> > > > sumsqr(Py.Data)/N % this one doesn't
>
> > > close all, clear all, clc
> > > N = 100000; % N = 100 might be better
> > > randn('state',0)
> > > x = 4*randn(N,1); % wgn with a variance of 16
> > > meanx = mean(x) % 0.012829
> > > varx = var(x) % 16.118
> > > meansqrx = mean(x.^2) % 16.118
>
> > > % NOTE: meanx is to small and N is too large to
> > > % distinguish between the two
>
> > > check = meansqrx-varx % 3.4184e-006
>
> > > X = fft(x);
> > > PSD = abs(X).^2/N; % Not scaled by (1/Fs)
> > > meanPSD = mean(PSD) % 16.118
> > > PSDdb = 10*log10(PSD);
> > > meanPSDdb = mean(PSDdb) % 9.5829
> > > illegal = 10*log10(meanPSD) % 12.073
>
> > > Hope this helps.
>
> > >Greg
>
> > HiGreg,
>
> > My main purpose is to find a relationship between variance and Periodogram PSD estimations. The reason I use very large value N is because I believe the value will converge to its true value.
>
> > For example, If I use 100, varx=12.0688 where the theoretical value should be 16.
>
> > I started with WGN because I know its variance and mean of PSD as N->infinity. I don't know any other "Gold standard", I thought it might be wise to start with something I know. (highly appreciate if anyone as any input/comments on this topic).
>
> > For a WGN with variance = 1, its mean PSD and variance are both = 1.. This seems doesn't agree with your comments.
>
> > I am sorry for my understanding, and I am grateful for your answer.
>
> > Dean
>
> Hi Dean, keep in mind that the unmodified periodogram only converges in expected value (the mean) to true PSD as N goes to infinity. The periodogram is no a consistent estimator of the true PSD, which means that its variance will not go to zero. The variance of the periodogram is approximately the square of the true PSD.
>

Hi Wayne,

Is this true for a general random process or are there
caveats? e.g., zero mean, no deterministic trend, etc

If you run the code in my previous post you
will see that I get twice the theoretical
value for varPSD at DC and the Nyquist frequency.

Do you have any idea why?

Greg