From: Raphael on
"Bruno Luong" <b.luong(a)fogale.findmycountry> wrote in message <hqaalt$hle$1(a)fred.mathworks.com>...
> "Yi Cao" <y.cao(a)cranfield.ac.uk> wrote in message <

d = 1./sum(Neigh);
d = spdiags(d(:),0,n,n);
AverageVelocity = d*(Neigh*Velocity);

with parentheses like Yi Cao calculated is indeed very fast (and easy). Great !

Only issue : it doesn't take care of NaNs.

I'm thinking that putting rows and columns i in Neigh to 0 if Velocity(i,:) is NaN should work. Shouldn't it ?
From: Bruno Luong on
Like YC's matrix product but, taking into account NaN:

V = Velocity;
BAD = isnan(V);
V(BAD) = 0;
V = Neigh*[V ~BAD];
V = reshape(V,[n 3 2]);
AverageVelocity = V(:,:,1)./V(:,:,2);

% Bruno
From: vortse a on
That's it! I have been bested in all accounts by the masters of m-code!

I found the bug in my code (I had stupidly reversed as and bs in find), I corrected Raphael's code, I updated Yi Cao's code to use spdiags and added Bruno Luong's accumarray solution to the comparison. The result speaks for itself:

n=10000, time in seconds, profiler on
thresh YiCao#3 Raphael vortse a B.Luong
0.60 0.7334 1.3698 6.6863 4.5849
0.80 0.1214 0.6612 1.8262 1.2387
0.90 0.0316 0.5022 0.5181 0.3029
0.95 0.0094 0.4235 0.1579 0.0800
0.99 0.0027 0.4013 0.0348 0.0050
0.997 0.0022 0.3972 0.0096 0.0029
0.999 0.0020 0.3904 0.0059 0.0030

I never thought I'd see such a speedup, so simply.

the new comparison code:

clear all
% some example data
n=10000;
Velocity=randn(n,3);
Neighseed=rand(n)+speye(n); % now all points are neighbour to themselves
sparsity=[0.6 0.8 0.9 0.95 0.99 0.997 0.999];
tims=zeros(length(sparsity),4);
% for different sparsity cases
for spk=1:length(sparsity)
Neigh0=Neighseed>sparsity(spk);
Neigh0= Neigh0 & Neigh0';
Neigh=sparse(Neigh0);
%
% Yi Cao's code
tic
d = 1./sum(Neigh); % updated 2nd time!
d = spdiags(d(:),0,n,n);
AverageVelocity1 = d*(Neigh*Velocity);
% AverageVelocity1 = diag(1./sum(Neigh,2))*(Neigh*Velocity);
% AverageVelocity1 = diag(1./sum(Neigh))*Neigh*Velocity;
t1=toc

% Raphael's code
tic
AverageVelocity2=NaN(n, 3);
for i=1:n
% AverageVelocity2(i,:)=nanmean(Velocity(Neigh(:,i),:));
AverageVelocity2(i,:)=nanmean(Velocity(Neigh(:,i),:),1); % updated!
end
t2=toc

% vortse a's code
tic
% preparation of indexes for neighbour summation (vectorised)
[bs,as]=find(Neigh); % neighbour indices, as is sorted
nbs=full(sum(Neigh,1)); % number of neighbours per point
is=1:length(nbs);
nbzero=nbs==0;
nbnz=not(nbzero);
a=nbs;
isnn=is(nbnz);
nbss=cumsum(a);
c=[0 nbss(1:end-1)];
mxa=max(a);
mxav=(1:mxa).';
inbc=bsxfun(@le,mxav,a);
b=bsxfun(@times,mxav,inbc);
nbc=bsxfun(@plus,b,c);
extra=nbss(end)+1; % location of extra element for padding
nbc(~inbc)=extra;
%
vb=[Velocity(bs,:);[NaN NaN NaN]]; % neighbour velocities for all points
AverageVelocity3=NaN(n,3);
for dim=1:3
vbd=vb(:,dim);
AverageVelocity3(:,dim)=nanmean(vbd(nbc'),2);
end
t3=toc

% Bruno Luong's code
tic
AverageVelocity4 = nan(n, 3);
[i j] = find(Neigh);
for k=1:3
vk = Velocity(:,k);
uk = isnan(vk);
vk(uk) = 0;
s = accumarray(j, vk(i), [n 1]);
c = accumarray(j, 1-uk(i), [n 1]);
AverageVelocity4(:,k) = s./c;
end
t4=toc
%
tims(spk,1:4)=[t1 t2 t3 t4];
% check results
resultcheck12=all(lt(abs(AverageVelocity1-AverageVelocity2),10^-8))
resultcheck32=all(lt(abs(AverageVelocity3-AverageVelocity2),10^-8))
resultcheck42=all(lt(abs(AverageVelocity4-AverageVelocity2),10^-8))
end
%
semilogy(sparsity,tims,'-*','DisplayName','time comparison');
legend('Yi Cao #3','Raphael','vortse a', 'Bruno Luong');
From: Raphael on
"Bruno Luong" <b.luong(a)fogale.findmycountry> wrote in message <hqaek9$qic$1(a)fred.mathworks.com>...
> Like YC's matrix product but, taking into account NaN:
>
> V = Velocity;
> BAD = isnan(V);
> V(BAD) = 0;
> V = Neigh*[V ~BAD];
> V = reshape(V,[n 3 2]);
> AverageVelocity = V(:,:,1)./V(:,:,2);
>
> % Bruno


Simply brilliant.
It even "recovers" the NaNs by dividing 0 by 0. It is also super fast.
Thank you very much.