From: Jon
I'm using R2009a, and I've recently spent a good deal of time trying to understand why a seemingly simple method was taking forever to run. The bottom line, I believe, is that the built-in optimization is lacking in this important case: the speed difference between the fastest and slowest methods is almost 90 times (4.49 s vs. 0.05 s in the timings below). I would be interested in comments, and I hope the findings help others with similar difficulties.

The class has two properties: a large array and a simple scalar value.
The goal: use linear indexing (as opposed to subscripting) to assign the scalar property value to the locations given by the indices in the large array, inside a loop. (The loop is necessary in the real code I am working with, so it is kept here even though it looks silly in this reduced example.)
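For reference, linear indexing addresses elements by their position in the array's underlying column-major storage rather than by row/column subscripts. A minimal illustration (not part of the class below):

```matlab
A = zeros(3,3);
A([1 5 9]) = 7;   % linear indices 1, 5 and 9 are the diagonal elements
% A now holds 7 on the diagonal and 0 elsewhere
```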

The example class definition:

classdef slowguy < handle
    properties
        bigdata;
        foo = 0.7;
    end
    methods
        function sg = slowguy()
            sg.bigdata = zeros([300 300 40]);
        end
        function update_slow(sg)
            % assign directly into the property, reading sg.foo
            % inside the loop
            indices = 100:10000;
            for count = 1:100
                sg.bigdata(indices) = sg.foo;
            end
        end
        function update_better(sg)
            % work on a local temporary, then store it back once
            indices = 100:100000;
            temp = zeros([300 300 40]);
            for count = 1:100
                temp(indices) = sg.foo;
            end
            sg.bigdata = temp;
        end
        function update_better2(sg)
            % as above, but also hoist the sg.foo read out of the loop
            indices = 100:100000;
            temp = zeros([300 300 40]);
            myvalue = sg.foo;
            for count = 1:100
                temp(indices) = myvalue;
            end
            sg.bigdata = temp;
        end
        function update_best(sg)
            % assign into the property, but read sg.foo only once
            indices = 100:10000;
            temp = sg.foo;
            for count = 1:100
                sg.bigdata(indices) = temp;
            end
        end
    end
end

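The timing transcript below assumes the object had already been constructed; the construction is not shown in the transcript, but it would be something like:

```matlab
tester = slowguy();   % allocates the 300x300x40 bigdata array
```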
The results:
>> tic;tester.update_slow;toc;
Elapsed time is 4.493027 seconds.
>> tic;tester.update_better;toc;
Elapsed time is 0.319779 seconds.
>> tic;tester.update_better2;toc;
Elapsed time is 0.315839 seconds.
>> tic;tester.update_best;toc;
Elapsed time is 0.051974 seconds.

To the point: the best performance occurs in update_best (0.05 s), where we assign sg.foo to a temporary variable "temp" and use "temp" in the assignment to sg.bigdata. update_better and update_better2 are slower (0.32 s) but not awful; their overhead seems to be in allocating the temporary array, and the loop optimization appears to do the right thing with the sg.foo property access.

But the performance of update_slow is just amazingly abysmal. It is as if MATLAB were iterating over each index and performing an individual assignment! I am really at a loss as to why the performance is so bad. If anyone can help, that would be great.
From: per isakson
"Jon " <moonshadow0357(a)yahoo.com> wrote in message <i3i3sa$ae8$1(a)fred.mathworks.com>...
> [quoted post snipped]

I cannot help, but thank you for pointing out the problem.

I use R2010a (64-bit) on:
Processor: Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66 GHz
Installed memory (RAM): 8.0 GB (7.87 GB usable)
System type: 64-bit operating system (Windows 7)

I have run slowguy with tic/toc and with PROFILE, and the same with slowguy_poi. In slowguy_poi I have changed to logical indexing (see below).

According to PROFILE it is the assignment in the loop that takes "all" the time in all cases.
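A sketch of how such a PROFILE run can be reproduced, using the standard MATLAB profiler commands:

```matlab
profile on
tester = slowguy();
tester.update_slow;
profile off
profile viewer   % per-line report; the indexed assignment in the loop dominates
```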

My results (profile and tic/toc agree; times in seconds):

            slowguy   slowguy_poi
slow          2.3        2.8
better        0.09       0.59
better2       0.08       0.59
best          0.011      0.58

With logical indexing there is no difference between better, better2 and best. However, for better, better2 and best, logical indexing is an order of magnitude slower than linear indexing. Only 0.3 per cent of the elements are changed in the assignment:
(>> numel(100:10000)/numel(zeros([300 300 40])), ans = 0.0028)

slowguy_poi (update_best rewritten with logical indexing):

function update_best(sg)
    is = false( numel( sg.bigdata ), 1 );
    ix = 100:10000;
    is(ix) = true;
    temp = sg.foo;
    for count = 1:100
        sg.bigdata(is) = temp;
    end
end

/ per
From: Matt J
"Jon " <moonshadow0357(a)yahoo.com> wrote in message <i3i3sa$ae8$1(a)fred.mathworks.com>...
> [quoted post snipped]
==================

It's peculiar indeed. My one comment, for now, is that the problem seems to occur only for handle classes: when you take away the "< handle" subclassing, the speed differences go away.


>> tic;tester.update_slow;toc;
Elapsed time is 0.113028 seconds.

>> tic;tester.update_better;toc;
Elapsed time is 0.263146 seconds.

>> tic;tester.update_better2;toc;
Elapsed time is 0.249069 seconds.

>> tic;tester.update_best;toc;
Elapsed time is 0.116306 seconds.
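
The value-class variant used for the comparison is just the same class body without the subclassing. A minimal sketch (the name fastguy is mine; only update_slow shown):

```matlab
classdef fastguy   % value class: no "< handle"
    properties
        bigdata;
        foo = 0.7;
    end
    methods
        function sg = fastguy()
            sg.bigdata = zeros([300 300 40]);
        end
        function update_slow(sg)
            indices = 100:10000;
            for count = 1:100
                sg.bigdata(indices) = sg.foo;   % modifies a local copy only
            end
        end
    end
end
```

Note that in a value class the method works on a copy of the object, so the updated bigdata is discarded unless the method returns sg; that does not affect the timing comparison.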