From: Shu Heng Shu Heng on
I tried performing summation of an array, A, using two ways:

1) sum(A)
2)x=0;
for i=1:n %n is the size of the array
x=x+A(i);
end
x %x is the sum

I found that method 1 is significantly faster, especially when the array size is large. At a size of 1048576, the first method yields the result in 0.0035 s, whereas the second method takes 1.7 s. (I used a vector of ones for testing.)

Why is there such a big difference in computation time? I can't think of a more efficient way to perform the summation serially than method 2. Does Matlab make use of the multiple-processor capabilities of the CPU?
From: Walter Roberson on
Shu Heng Shu Heng wrote:
> I tried performing summation of an array, A, using two ways:
>
> 1) sum(A)
> 2)x=0;
> for i=1:n %n is the size of the array
> x=x+A(i);
> end
> x %x is the sum
>
> I found that method 1 is significantly faster, especially when the array
> size is large. At a size of 1048576, the first method yields the result
> in 0.0035 s, whereas the second method takes 1.7 s. (I used a vector of
> ones for testing.)

> Why is there such a big difference in computation time?

Because every step of the summation loop has to be processed through the
Matlab interpreter. If you put the functionality into a function, the code
would only need to be parsed once, but every step would still run through
the threaded interpreter. The built-in sum() function, on the other hand,
is compiled down to machine language.

Also, once you hit a sufficient size, the summation would probably be
passed off to the highly-optimized BLAS library, which would use
multiple threads or CPUs if available. The size you indicate is probably
large enough to be handed to BLAS.
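
To see both effects on your own machine, here is a rough sketch (timings vary by release and hardware, and maxNumCompThreads may not exist in very old releases):

```matlab
% Sketch: time the built-in sum() against an interpreted loop, and
% check how many computational threads the built-ins may use.
A = ones(1, 1048576);

tic; s1 = sum(A); t1 = toc;      % compiled, possibly multithreaded code

s2 = 0;
tic
for k = 1:numel(A)
    s2 = s2 + A(k);              % every iteration passes through the interpreter
end
t2 = toc;

nthreads = maxNumCompThreads     % threads available to compiled built-ins
fprintf('sum: %.4g s, loop: %.4g s, ratio %.1f\n', t1, t2, t2/t1)
```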
From: dpb on
Shu Heng Shu Heng wrote:
....

> Why is there such a big difference in computation time? I can't think of
> other more efficient ways to perform summation serially than method 2.
> Does Matlab make use of the multiple processor capabilities of CPU?

The internal sum() (as well as the other "built-in" functions) of Matlab
is compiled/optimized code, whereas command-line code is tokenized and
then dispatched to whatever operations are required. In such simple
tests, most of what you see is the function-call overhead rather than
the computation of the sum itself.

Somebody else will have to comment on what later versions of ML do w/
multi-cores automagically....
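
One way to see that overhead, rather than arithmetic, dominates for small inputs (a sketch only; the sizes and repeat counts here are arbitrary):

```matlab
% Sketch: average per-call cost of sum() on a tiny vector vs. a large
% one. If arithmetic dominated, the large case would be ~1e5x slower;
% the gap is much smaller because of fixed call overhead.
small = ones(1, 10);
big   = ones(1, 1e6);

tic; for k = 1:1e4, sum(small); end; t_small = toc/1e4;
tic; for k = 1:1e3, sum(big);   end; t_big   = toc/1e3;

fprintf('per-call: small %.3g s, big %.3g s, ratio %.1f\n', ...
        t_small, t_big, t_big/t_small)
```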

--
From: us on
"Shu Heng Shu Heng" <scilover_8(a)hotmail.com> wrote in message <hsp13v$671$1(a)fred.mathworks.com>...
> I tried performing summation of an array, A, using two ways:
>
> 1) sum(A)
> 2)x=0;
> for i=1:n %n is the size of the array
> x=x+A(i);
> end
> x %x is the sum
>
> I found that method 1 is significantly faster, especially when the array size is large. At a size of 1048576, the first method yields the result in 0.0035 s, whereas the second method takes 1.7 s. (I used a vector of ones for testing.)
>
> Why is there such a big difference in computation time? I can't think of a more efficient way to perform the summation serially than method 2. Does Matlab make use of the multiple-processor capabilities of the CPU?

a hint:
- case 1: calls a compiled subroutine
- case 2: calls the command interpreter

us
From: Matt Fig on
As mentioned previously, putting your code into a function gives the FOR loop much better performance. For example, I get a reltime of about 1.9 on my old machine running 2007b.

function [] = compare_summs()
% Spits out the relative speed of built-in SUM to a FOR loop.
A = rand(1,1048576);
T = [0 0];

tic
X = sum(A);            % built-in, compiled
T(1) = toc;

tic
x = 0;
for ii = 1:numel(A)    % interpreted loop
    x = x + A(ii);
end
T(2) = toc;

reltime = T(2)/T(1)    % no semicolon, so the ratio is displayed
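
On releases that have TIMEIT (a later addition than the 2007b machine mentioned above, so treat its availability as an assumption), a more robust version of the same comparison might look like:

```matlab
function reltime = compare_summs_timeit()
% Relative speed of a FOR loop to the built-in SUM, using TIMEIT,
% which handles warm-up and repetition automatically.
A = rand(1, 1048576);
reltime = timeit(@() loop_sum(A)) / timeit(@() sum(A));
end

function x = loop_sum(A)
% Plain interpreted summation loop.
x = 0;
for k = 1:numel(A)
    x = x + A(k);
end
end
```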