From: vortse a on
"Bruno Luong" <b.luong(a)fogale.findmycountry> wrote in message <hqa03o$aar$1(a)fred.mathworks.com>...
>
> How many "1" Neigh has by column in average? If it's much smaller than n, you should take a look of SPARSE matrtix.
>
> Bruno

I think Rafael mentioned that his Neigh matrix was already sparse. However sparsity is the key to optimise performance.

I had a similar sparse averaging problem a couple of months ago and I used a vectorised approach to perform summation of terms based on a zero-padded index matrix. I had put quite a bit of effort into this, so I gave it a try earlier with this problem here. I was dissappointed to find that my code was slower than Rafael's loop. Then I noticed that Yi Cao had used a sample matrix made with rand(n)>0.6, which has rather low sparsity. So I tried it with samples of varying sparsity:

n=10000, time in seconds
thresh YiCao Raphael vortsea
0,60 341,2 4,278 7,034
0,80 106,4 2,747 3,033
0,90 28,55 0,725 0,493
0,95 13,23 0,458 0,154
0,99 1,809 0,439 xxxxx*

* crash due to a bug that I resolved afterwards

n=5000, time in seconds
thresh YiCao Raphael vortsea
0,60 40,91 0,397 1,579
0,80 11,23 0,239 0,496
0,90 3,738 0,186 0,152
0,95 1,771 0,180 0,0522
0,99 0,390 0,177 0,0063

My code has to use a safeguard for the cases where there are no neighbours at all, in which case the result is NaN, but it can be easily averaged with the point's original velocity.

Also you should have a look at the results of Rafael's and YiCao's code for the 3 dimensions, I feel that in the case where there is only one neighbour, the function mean screws things up, because x,y,z velocities for those points turn up to be equal, which shouldn't be.

Here's my comparison script:

clear all
% some example data
n=5000;
Velocity=randn(n,3);
Neighseed=rand(n)-speye(n);
sparsity=[0.6 0.8 0.9 0.95 0.99];
tims=zeros(length(sparsity),3);
for k=1:length(sparsity)
Neigh0=Neighseed>sparsity(k);
Neigh0= Neigh0 & Neigh0';
Neigh=sparse(Neigh0);
% Yi Cao's code
tic
AverageVelocity1 = diag(1./sum(Neigh))*Neigh*Velocity;
t1=toc

% Raphael's old code
tic
AverageVelocity2=NaN(n, 3);
for i=1:n
AverageVelocity2(i,:)=nanmean(Velocity(Neigh(:,i),:));
end
t2=toc

% vortse a's code
tic
% preparation of indexes for neighbour summation (vectorised)
[as,bs]=find(Neigh); % neighbour indices, bs is sorted
nbs=full(sum(Neigh,1)); % number of neighbours per point
is=1:length(nbs);
nbzero=nbs==0;
nbnz=not(nbzero);
a=nbs;
isnn=is(nbnz);
nbss=cumsum(a);
c=[0 nbss(1:end-1)];
mxa=max(a);
mxav=(1:mxa).';
inbc=bsxfun(@le,mxav,a);
b=bsxfun(@times,mxav,inbc);
nbc=bsxfun(@plus,b,c);
extra=nbss(end)+1; % location of extra element for padding
nbc(~inbc)=extra;
%
vb=[Velocity(bs,:);[NaN NaN NaN]]; % neighbour velocities for all points
AverageVelocity3=NaN(n,3);
for dim=1:3
vbd=vb(:,dim);
AverageVelocity3(:,dim)=nanmean(vbd(nbc'),2);
end
t3=toc
%
tims(k,1:3)=[t1 t2 t3];
% check results
resultcheck12=all(lt(abs(AverageVelocity1-AverageVelocity2),10^-8))
resultcheck32=all(lt(abs(AverageVelocity3-AverageVelocity2),10^-8))
end
%
semilogy(sparsity,tims,'-*','DisplayName','time comparison');
legend('Yi Cao','Rapahel','vortse a');
From: vortse a on
"Raphael " <raphael.keller(a)gmail.com> wrote in message <hqa357$597$1(a)fred.mathworks.com>...
>
> > Therefore, by changing calculation sequence, we can reduce operations from (n+3)n^2 to 6n^2. For large n, this will significantly reduce time. It should be compatible with or even faster than the loop based code.
> >
> > HTH
> > Yi
> >
>
> It's not faster, but no more than 10% slower.
> Thank you very much for your answer.

I updated Yi Cao's part of the code and it runs at the same speed as yours more or less. However, for any threshold above 0,87 my code beats them both.

On correctness: I don't have time to check now, but it seems Yi Cao's code is correct for 1 neighbour, while yours isn't. Mine yields totally different numbers, which I guess is a bug with the creation of matrix "nbc". I'll try to find what's wrong tomorrow.
From: Bruno Luong on
What about this:

% Data
n=10000;
Velocity = rand(n,3);
Velocity(ceil(rand(1,100)*numel(Velocity))) = NaN;
Neigh = logical(sprand(n,n,0.01));

% Engine
AverageVelocity = nan(n, 3);
[i j] = find(Neigh);
for k=1:3
vk = Velocity(:,k);
uk = isnan(vk);
vk(uk) = 0;
s = accumarray(j, vk(i), [n 1]);
c = accumarray(j, 1-uk(i), [n 1]);
AverageVelocity(:,k) = s./c;
end
From: Bruno Luong on
"Yi Cao" <y.cao(a)cranfield.ac.uk> wrote in message <
>
> The timing issue does not come from diag, but from calculation sequency. In the orginal code, I gave, it will firstly calculate diag(1./sum(Neigh,2))*Neigh, which is n^3 operations. Then, the results multiply Velocity, which is 3n^2 operations.

Is there any the reason to insist on using a full array for a large diagonal matrix? this should work much better:

d = 1./sum(Neigh);
d = spdiags(d(:),0,n,n);
AverageVelocity = d*Neigh*Velocity;

It might be worth to play with the product order for different scenario of sizes.

... = (d*Neigh)*Velocity OR
... = d*(Neigh*Velocity)

Bruno
From: Raphael on

> On correctness: I don't have time to check now, but it seems Yi Cao's code is correct for 1 neighbour, while yours isn't. Mine yields totally different numbers, which I guess is a bug with the creation of matrix "nbc". I'll try to find what's wrong tomorrow.


The dimension along which the mean is taken must be forced, you are right, thank you for pointing this out! So the code should read

for i=1:n
AverageVelocity2(i,:)=nanmean(Velocity(Neigh(:,i),:),1);
end

nb : Cells are always neighbours with themselves in my calculation (it makes sense for the averaging).


As for Bruno's code : with the data I work with (n between 5000 to 10000, sparsity>0.99), it is 10% faster than "my" code. I guess that's not enough to compete with vortse a's version !