From: Juliette Salexa on
Hello,

I have the array:

A=[1 ;
2 ;
3 ;
4 ;
5 ;
6 ;
7 ;
8 ;
9]

And I want to produce the array:

B=[1+4+7;
1+4+7;
1+4+7;
2+5+8 ;
2+5+8 ;
2+5+8 ;
3+6+9 ;
3+6+9 ;
3+6+9 ]

I've tried the following methods, but they are both too slow when A is large (~ size=4^14,1) and the method is part of a loop that's iterated 1000s of times.
==============================
METHOD 1:

A=expand(sum(reshape(A,length(A)/3,3),2),[3,1]);

where expand is:
http://www.mathworks.com/matlabcentral/fileexchange/24536-expand

I'm aware that expand is an m-file and reshape is built-in, but based on the output of my profiler, the reshape is still taking a significant percentage of time.

I've also tried splitting it up:

B=zeros(length(A)/3,3);
B=reshape(A,length(A)/3,3);
A=expand(sum(B,2),[3,1]);

But the difference in time is less than a second.

I think reshape might be slow because matlab stores elements column-wise, and by using reshape it has to find three new contiguous blocks of memory and store some elements along rows ?
==============================
METHOD 2:

repetitionIndex=repmat((1:3)',3,1);
A=expand(accumarray(repetitionIndex,A(:)),[3,1]);

This method is a little bit slower than method 1, probably because repetitionIndex is taking up extra memory, but I'm again not 100% sure.

==============================

Does anyone think there's a faster way ??

Any suggestions or discussion would be extremely appreciated.
Juliette
From: Doug Schwarz on
Juliette Salexa wrote:
> Hello,
>
> I have the array:
>
> A=[1 ;
> 2 ;
> 3 ;
> 4 ;
> 5 ; 6 ;
> 7 ;
> 8 ;
> 9]
>
> And I want to produce the array:
>
> B=[1+4+7;
> 1+4+7;
> 1+4+7;
> 2+5+8 ;
> 2+5+8 ;
> 2+5+8 ;
> 3+6+9 ;
> 3+6+9 ;
> 3+6+9 ]
>
> I've tried the following methods, but they are both too slow when A is
> large (~ size=4^14,1) and the method is part of a loop that's iterated
> 1000s of times.
> ==============================
> METHOD 1:
>
> A=expand(sum(reshape(A,length(A)/3,3),2),[3,1]);
>
> where expand is:
> http://www.mathworks.com/matlabcentral/fileexchange/24536-expand
>
> I'm aware that expand is an m-file and reshape is built-in, but based on
> the output of my profiler, the reshape is still taking a significant
> percentage of time.
>
> I've also tried splitting it up:
>
> B=zeros(length(A)/3,3);
> B=reshape(A,length(A)/3,3);
> A=expand(sum(B,2),[3,1]);
>
> But the difference in time is less than a second.
>
> I think reshape might be slow because matlab stores elements
> column-wise, and by using reshape it has to find three new contiguous
> blocks of memory and store some elements along rows ?
> ==============================
> METHOD 2:
>
> repetitionIndex=repmat((1:3)',3,1);
> A=expand(accumarray(repetitionIndex,A(:)),[3,1]);
>
> This method is a little bit slower than method 1, probably because
> repetitionIndex is taking up extra memory, but I'm again not 100% sure.
>
> ==============================
>
> Does anyone think there's a faster way ??
>
> Any suggestions or discussion would be extremely appreciated.
> Juliette


reshape never takes much time because it never moves data.

I would do it this way:

temp = repmat(sum(reshape(A,[],3),2),1,3).';
B = temp(:);

--
Doug Schwarz
dmschwarz&ieee,org
Make obvious changes to get real email address.
From: Sean on
"Juliette Salexa" <juliette.physicist(a)gmail.com> wrote in message <hv8bq4$nv5$1(a)fred.mathworks.com>...
> Hello,
>
> I have the array:
>
> A=[1 ;
> 2 ;
> 3 ;
> 4 ;
> 5 ;
> 6 ;
> 7 ;
> 8 ;
> 9]
>
> And I want to produce the array:
>
> B=[1+4+7;
> 1+4+7;
> 1+4+7;
> 2+5+8 ;
> 2+5+8 ;
> 2+5+8 ;
> 3+6+9 ;
> 3+6+9 ;
> 3+6+9 ]
>
> I've tried the following methods, but they are both too slow when A is large (~ size=4^14,1) and the method is part of a loop that's iterated 1000s of times.
> ==============================
> METHOD 1:
>
> A=expand(sum(reshape(A,length(A)/3,3),2),[3,1]);
>
> where expand is:
> http://www.mathworks.com/matlabcentral/fileexchange/24536-expand
>
> I'm aware that expand is an m-file and reshape is built-in, but based on the output of my profiler, the reshape is still taking a significant percentage of time.
>
> I've also tried splitting it up:
>
> B=zeros(length(A)/3,3);
> B=reshape(A,length(A)/3,3);
> A=expand(sum(B,2),[3,1]);
>
> But the difference in time is less than a second.
>
> I think reshape might be slow because matlab stores elements column-wise, and by using reshape it has to find three new contiguous blocks of memory and store some elements along rows ?
> ==============================
> METHOD 2:
>
> repetitionIndex=repmat((1:3)',3,1);
> A=expand(accumarray(repetitionIndex,A(:)),[3,1]);
>
> This method is a little bit slower than method 1, probably because repetitionIndex is taking up extra memory, but I'm again not 100% sure.
>
> ==============================
>
> Does anyone think there's a faster way ??
>
> Any suggestions or discussion would be extremely appreciated.
> Juliette

How much RAM do you have? 4^14 of class double requires a lot of memory (2.147483648GB) and a second array the same size will require as much. If the numbers are all integers and depending on the maximum integer you have you could recast them as something that takes up less RAM.

The big question to ask is why do you need to do this and is there another way around it? It's probably unnecessary to have the same number repeated 3x over and over again.

Here's another way too:
A = [1:12]'; %to 12 so sizes are 3,4 to make it distinguishable
B = repmat(sum(reshape(A,[3 4]),2),1,3)';
C = B(:);
From: Juliette Salexa on
Thanks Sean and Doug.

I can't see the difference between your methods though! =(

I do appreciate that using REPMAT instead of EXPAND speeds things up by about a factor of 3.

But, as I said in the initial post, the profiler suggests that RESHAPE is taking up a significant percentage of the time:

===================
A=(1:3^12)';I=rand(3^12,1);
B=zeros(length(A)/3,3)';

profile on
for i=1:1000
temp=repmat(sum(reshape(A,length(A)/3,3),2),1,3).';
A=temp(:);
A=A.*I;
end
profile viewer
===================

Out of the 57 seconds, 14.5 seconds were from using REPMAT,

If I avoid call SUM and LENGTH:

lengthOfAover3=length(A)/3;
for i=1:1000
temp=reshape(A,lengthOfAover3,3);
temp2=repmat(temp(:,1),1,3);
A=temp2(:);
A=A.*I;
end

===================

Out of the 39 seconds, 13.5 seconds were from using REPTMAT

=> which leads me to believe RESHAPE is taking the majority of the time.

Considering that I don't actually *need* the reshaped array, and that the pattern of indices over which I'm summing is very simple, I was hoping there would be a way to do this without RESHAPE.

Also, I've been told that the transpose is very costly for large matrices since matlab stores elements column-wise.

There must be a more efficient way to add every Nth element and store them in an array!



Thanks sean for your concern, but I need to use that much memory since I'm reusing the result 1000's of times and it would be a nightmare to recalculate it each time. And the reason why I'm making 3 copies of the same elements is that (as in the example above), I'm multiplying each of the 3 copies by a different number - If I could get A.*I (as in the above example) without having to make 3 copies of each element of A, that would be very nice! But I can't think of a way to do that =(


Thanks again for your help and suggestions
From: Matt J on
"Juliette Salexa" <juliette.physicist(a)gmail.com> wrote in message <hv8i1e$3g4$1(a)fred.mathworks.com>...
seconds, 14.5 seconds were from using REPMAT,
>
> If I avoid call SUM and LENGTH:
>
> lengthOfAover3=length(A)/3;
> for i=1:1000
> temp=reshape(A,lengthOfAover3,3);
==============

As an FYI, you don't need to be computing lengthOfAover3. You can just do

temp=reshape(A,[],3);

*************************
> temp2=repmat(temp(:,1),1,3);
> A=temp2(:);
> A=A.*I;
> end
>
> ===================
>
> Out of the 39 seconds, 13.5 seconds were from using REPTMAT
>
> => which leads me to believe RESHAPE is taking the majority of the time.
=================

You're ignoring the contribution of A.*I. No doubt that contributes an appreciable fraction.


******************
> Thanks sean for your concern, but I need to use that much memory since I'm reusing the result 1000's of times and it would be a nightmare to recalculate it each time. And the reason why I'm making 3 copies of the same elements is that (as in the example above), I'm multiplying each of the 3 copies by a different number - If I could get A.*I (as in the above example) without having to make 3 copies of each element of A, that would be very nice! But I can't think of a way to do that =(
=========================

You mean you have a vector and need to multiply it by different scalars? Again, it would be good to know what that result is going to feed into. It seems like a lot of redundant calculation. In any case, you can avoid using repmat by using bsxfun instead, which is more memory efficient, e.g.

>> bsxfun(@times,[1;2;3],[10,11,12])

ans =

10 11 12
20 22 24
30 33 36
 |  Next  |  Last
Pages: 1 2 3
Prev: MatLab Busy - for loop
Next: Modeling a Radar System