From: Matt J on 27 Mar 2010 09:25 Doug Schwarz <see(a)sig.for.address.edu> wrote in message <see-3B9FF0.08442127032010(a)news.frontiernet.net>... > Amazingly, The MathWorks has anticipated your need and has provided a > function to compute exactly the first element of the DFT. It is very > fast. Use it like this: > > X0 = sum(x); ======= Well, yes, but in fairness, the OP offered that only as an example. For me, until I understand better the advantages of the Goertzel algorithm, the thing to do would be an explicit DFT computation Result=exp(j*(-2*pi/N)*freqs(:)*(0:n-1))*X; where freqs is a vector of zero-based frequency indices.
From: Bruno Luong on 27 Mar 2010 10:14 "Matt J " <mattjacREMOVE(a)THISieee.spam> wrote in message <hokvm6$3u6$1(a)fred.mathworks.com>... > This must be something from the Signal Processing Toolbox. I have no doc on a GOERTZEL command, at any rate. > > I've never encountered the Goertzel algorithm before, although from this wiki article, it has obviously been around for a long time, > > http://en.wikipedia.org/wiki/Goertzel > > However, I can't really see what advantages it conveys over direct evaluation of the DFT. The computational complexity, as discussed in teh article, seems to be the same for both. I must agree with Matt here. Goertzel's is probably intesresting for realtime system. But on Intel processor, it can be beat with simple C program as showed below (Vista 32, 2010A, 2-year-old laptop) N = 1e6; data=rand(1,N)+1i*rand(1,N); k = 1000; % fft component tic y1=goertzel(data,k); t1=toc % Elapsed time is 0.0280 seconds. tic N = length(data); a=exp(-1j*(2*pi*(k-1)/N)); v(1:N) = a; v(1) = 1; v = cumprod(v); y2 = data*v(:); t2=toc % Elapsed time is 0.0752 seconds. tic N = length(data); a = exp(-1j*(2*pi*(k-1)/N)); y3 = get1fft(data, a); % FEX t3=toc % Elapsed time is 0.0118 seconds. /* MEXFILE get1fft.C */ #include "mex.h" #define X prhs[0] #define A prhs[1] #define S plhs[0] void mexFunction(int nlhs, mxArray *plhs[],int nrhs, const mxArray *prhs[]) { mwIndex N, n; double ar, ai, anr, ani, tmp; double sr, si; double *xr, *xi, *xrn, *xin; mwSize dim[2] = {1, 1}; N = mxGetN(X); ar = *mxGetPr(A); ai = *mxGetPi(A); xr = mxGetPr(X); xi = mxGetPi(X); anr = 1; ani = 0; sr = si = 0; xrn = xr; if (xi==NULL) { for (n=0; n<N; n++) { sr += anr*(*xrn); si += ani*(*xrn); tmp = anr*ar-ani*ai; ani = anr*ai+ani*ar; anr = tmp; xrn++; } } else { xin = xi; for (n=0; n<N; n++) { sr += anr*(*xrn)-ani*(*xin); si += ani*(*xrn)+anr*(*xin); tmp = anr*ar-ani*ai; ani = anr*ai+ani*ar; anr = tmp; xrn++; xin++; } } S = mxCreateNumericArray(2, dim, mxDOUBLE_CLASS, mxCOMPLEX); *mxGetPr(S) = sr; *mxGetPi(S) = si; return; } % Bruno
From: Bruno Luong on 27 Mar 2010 10:45 BTW thank you Wayne for the head up of goertzel's algorithm. I have leaned something new. Bruno
From: Wayne King on 27 Mar 2010 10:54 "Bruno Luong" <b.luong(a)fogale.findmycountry> wrote in message <hol5m2$rr0$1(a)fred.mathworks.com>... > BTW thank you Wayne for the head up of goertzel's algorithm. I have leaned something new. > > Bruno No problem :) I thought that maybe in its second-order implementation, the Goertzel algorithm might help, but I see that the computational savings are not what I had thought. Wayne
From: Lei on 28 Mar 2010 02:11 "Bruno Luong" <b.luong(a)fogale.findmycountry> wrote in message <hol3rd$2h8$1(a)fred.mathworks.com>... > "Matt J " <mattjacREMOVE(a)THISieee.spam> wrote in message <hokvm6$3u6$1(a)fred.mathworks.com>... > > > This must be something from the Signal Processing Toolbox. I have no doc on a GOERTZEL command, at any rate. > > > > I've never encountered the Goertzel algorithm before, although from this wiki article, it has obviously been around for a long time, > > > > http://en.wikipedia.org/wiki/Goertzel > > > > However, I can't really see what advantages it conveys over direct evaluation of the DFT. The computational complexity, as discussed in teh article, seems to be the same for both. > > I must agree with Matt here. Goertzel's is probably intesresting for realtime system. But on Intel processor, it can be beat with simple C program as showed below (Vista 32, 2010A, 2-year-old laptop) > > N = 1e6; > data=rand(1,N)+1i*rand(1,N); > k = 1000; % fft component > > tic > y1=goertzel(data,k); > t1=toc % Elapsed time is 0.0280 seconds. > > tic > N = length(data); > a=exp(-1j*(2*pi*(k-1)/N)); > v(1:N) = a; > v(1) = 1; > v = cumprod(v); > y2 = data*v(:); > t2=toc % Elapsed time is 0.0752 seconds. > > > tic > N = length(data); > a = exp(-1j*(2*pi*(k-1)/N)); > y3 = get1fft(data, a); % FEX > t3=toc % Elapsed time is 0.0118 seconds. > > > /* MEXFILE get1fft.C */ > #include "mex.h" > > #define X prhs[0] > #define A prhs[1] > #define S plhs[0] > void mexFunction(int nlhs, mxArray *plhs[],int nrhs, const mxArray *prhs[]) > { > mwIndex N, n; > double ar, ai, anr, ani, tmp; > double sr, si; > double *xr, *xi, *xrn, *xin; > mwSize dim[2] = {1, 1}; > > N = mxGetN(X); > ar = *mxGetPr(A); > ai = *mxGetPi(A); > xr = mxGetPr(X); > xi = mxGetPi(X); > anr = 1; > ani = 0; > > sr = si = 0; > xrn = xr; > if (xi==NULL) > { > for (n=0; n<N; n++) > { > sr += anr*(*xrn); > si += ani*(*xrn); > tmp = anr*ar-ani*ai; > ani = anr*ai+ani*ar; > anr = tmp; > xrn++; > } > } else > { > xin = xi; > for (n=0; n<N; n++) > { > sr += anr*(*xrn)-ani*(*xin); > si += ani*(*xrn)+anr*(*xin); > tmp = anr*ar-ani*ai; > ani = anr*ai+ani*ar; > anr = tmp; > xrn++; > xin++; > } > } > > > S = mxCreateNumericArray(2, dim, mxDOUBLE_CLASS, mxCOMPLEX); > *mxGetPr(S) = sr; > *mxGetPi(S) = si; > > return; > } > > % Bruno Dear Bruno, Thank you very much for your guidance. The Mexfunction approach is absolutely an brillant approach, actually I am just about to use this approach to optimize my code. I have conducted a few tests on the algorithm proposed by you, and it has huge advantages over the build-in "fft" when there are large numbers of input points (e.g. N = 2^20). However in my application, I only need to calculate 1024-point partial FFTs. The sample points are 256, and the other input points are zeros. I did test the computational speed of this algorithm, as is shown below: clear N = 1024; data1=rand(1,256)+1i*rand(1,256); data2=zeros(1,N-256); data = [data1 data2]; k = 2; % fft component L = length(data); a = exp(-1j*(2*pi*(k-1)/L)); tic for lop = 1:1000 %y1 = p_ifft(data,N,1); %y2 = pfft_lei(data); y1 = get1fft(data, a); % FEX end t1=toc % Elapsed time is 0.0385 seconds. tic for lop2 = 1:1000 y2 = fft(data); end t2 = toc % Elapsed time is 0.0275 seconds. It shows that under this circumstance the computational speed of the Mexfunction is slower than the build-in "fft" function. I am wondering how to solve this problem. Before yesterday I had no idea on the Mexfunction approach, so I need to learn this programming skill at the first stage. The code below is my partial-FFT algorithm: function Y =pfft_lei(data) % This is a radix-2 decimation-in-time PARTIAL-FFT algorithm N = length(data); X=bitrevorder(data); L=log2(N); Y(N,L,2) = zeros; bin1 = path_trace(N); for q = 1:N/2 %Output index selection k3=1; bin = bin1(q,:); for i1=1:L %Iteration stage for i2=bin(i1):k3:N if (Y(i2,i1,2)==0) k=N*(bin(i1)-1)/k3/2; j=i2+k3; W=complex(cos(2*pi*k/N),-sin(2*pi*k/N)); T=X(j)*W; X(j)=X(i2)-T; X(i2)=X(i2)+T; Y(i2,i1,1)=X(i2); Y(j,i1,1)=X(j); Y(i2,i1,2)=1; Y(j,i1,2)=1; else X(i2)=Y(i2,i1,1); end end k3=k3*2; end end function bin=path_trace(N) L = log2(N); for s=1:L bin(:,s)=repmat([1:2^(s-1)],1,N/2^(s-1)); end Currently I don't know how the convert this program into an Mex file. Besides, my algorithm is quite time-comsuming and it does not have any value for practical application. I may need some help on optimizing my code by the Mex approach. Thanks.
First
|
Prev
|
Next
|
Last
Pages: 1 2 3 Prev: Problem with gath geva algorithm Next: geometry profile in pde tool. |