From: Pradipto on 26 May 2010 19:15 Hi, There are two solutions for simple versions of the rolling regression problem available on Matlab Central. Suppose n = rolling window size. Suppose the problem is solve a rolling regression y=a+bx. 1. Now if x is a constant vector in each rolling window, i.e. let size(x,1)=n but size(y,1)>n, then the algorithm described here: http://www.mathworks.com/matlabcentral/newsreader/view_thread/49181 using the filter function works pretty well because it takes advantage of the fact the x (in the example on that webpage x was -20:-1:0) does not change in any rolling window. 2. Suppose however x is not a constant vector, but instead let size(x,1) = size(y,1) > n. Further, during each rolling window we want to regress y(j-n+1:j) against x(j-n+1:j). Then the algorithm described here: http://www.mathworks.com/matlabcentral/newsreader/view_thread/249905 using the cumsum function provides a very neat solution. However, again this takes advantage of the fact that in the case when x is only 1 column, i.e size(x,2)=1, we know the "close form analytic solution", i.e. beta (or the slope) = cov(x,y)/var(x), etc. 3. However consider the more complicated case when the number of columns of x say nc=size(x,2)>1 (and we have size(x,1)=size(y,1) etc) and i want to write a code where nc is flexible. Here are three alternate versions that I have -------- ver 1: totally non-vectorized ------------------ function [a,b]=rollregress(x,y,m) % a = intersect (the beta w.r.t to a dummy column of 1) % b = slopes (the beta for each column of x) [nr,nc]=size(x); x=[ones(nr,1) x]; nc=nc+1; b=NaN(nr-m+1,nc); for k=m:nr b(k-m+1,:)=regress(y(k-m+1:k),x(k-m+1:k,:)); end a=b(:,1); % intersect b=b(:,2:end); % slopes ------------------------------------------------------------------ ver1: time taken for input size of nr = 5114, nc = 2 is 0.551 seconds (averaged over 100 trials). -------- ver 2: partially vectorized ---------------------- function [a,b]=rollregress(x,y,m) % a = intersect (the beta w.r.t to a dummy column of 1) % b = slopes (the beta for each column of x) [nr,nc]=size(x); x=[ones(nr,1) x]; nc=nc+1; p=m+1:nr+1;q=1:nr-m+1; % get the rolling sum of y'*x sxy=[zeros(1,nc); cumsum(x.*y(:,ones(nc,1)),1)]; sxy=sxy(p,:)-sxy(q,:); % get the rolling sum of x'*x % --- also this loop is not bad, because typically nc is small sx2=NaN(nc,nr-m+1,nc); parfor j=1:nc sx2r=[zeros(1,nc); cumsum(x.*x(:,j*ones(nc,1)),1)]; sx2r=sx2r(p,:)-sx2r(q,:); sx2(j,:,:)=sx2r; end % --- this is the part that needs to be vectorized b=NaN(nr-m+1,nc); parfor k=1:nr-m+1 cxy=sxy(k,:); cx2=squeeze(sx2(:,k,:)); b(k,:)=cxy/cx2; end a=b(:,1); % intersect b=b(:,2:end); % slopes ------------------------------------------------------------------ ver2: time taken for input size of nr = 5114, nc = 2 is 0.456 seconds (averaged over 100 trials) with parfor loops (1 pool open) and 0.409 seconds (averaged over 100 trials) with for loops instead of the parfor loops. If I open 4 pools (I have a 4-core PC) using matlabpool('open',4), then the corresponding speed is 0.300 seconds. -------- ver 3: absolutely dumb version & also non-vectorized -- function [a,b]=rollregress(x,y,m) % a = intersect (the beta w.r.t to a dummy column of 1) % b = slopes (the beta for each column of x) [nr,nc]=size(x); x=[ones(nr,1) x]; nc=nc+1; b=NaN(nr-m+1,nc); for k=1:nr-m+1 yt=y(k:k+m-1);xt=x(k:k+m-1,:); b(k,:)=(yt'*xt)/(xt'*xt); end a=b(:,1); % intersect b=b(:,2:end); % slopes ------------------------------------------------------------------ ver3: time taken for input size of nr = 5114, nc = 2 is 0.219 seconds (averaged over 100 trials). ------------------------------------------------------------------ Ver 1 & 3 are also unaffected by the number of pools opened using matlabpool. In summary, the absolute dumbest non-vectorized version is the fastest version. Can anyone provide a faster version OR may be a fully vectorized version? Thanks, P.
|
Pages: 1 Prev: Remove certain spaces of a char matrix Next: Remove certain spaces of a char matrix |