From: Thibault Daoulas on
"Steve Amphlett" <Firstname.Lastname(a)Where-I-Work.com> wrote in message <hu8rfs$1da$1(a)fred.mathworks.com>...
> "Alan B" <monguin61REM(a)OVETHIS.yahoo.com> wrote in message <hu8r4i$7cq$1(a)fred.mathworks.com>...
> > dpb <none(a)non.net> wrote in message <hu8o1b$ok9$1(a)news.eternal-september.org>...
> > > Steven Lord wrote:
> > > > "Thibault Daoulas" <thibault.daoulas(a)gmail.com> wrote in message
> > > > news:hu882k$lsj$1(a)fred.mathworks.com...
> > > >> Hi all,
> > > >> I have been reading and trying several ways suggested here to handle
> > > >> growing matrices over loops. My concern starts the same way:
> > > >> - Unknown size of a matrix, and at each step, a new row is concatenated
> > > >
> > > > Do you know an upper bound on the size of your matrix (ideally one that's
> > > > not too much larger than the actual size your matrix will be in the end)?
> > > > If so, preallocate it to be that size and trim it at the end.
> > >
> > > What he said... :)
> > >
> > > Was just getting ready to respond to the other posting w/ the
> > > (apparently ubiquitous) "double-the-previously-allocated-size" algorithm
> > > that just never made sense to me as a logical approach unless the size
> > > expected was truly absolutely unknown and there was no way to even guess
> > > whether we're reaching the end or not during the process. Continuing to
> > > try to double the size of an allocation just reeks of problems to me;
> > > how well it actually works in general use I don't actually know because
> > > I've never implemented it as a technique owing to the aforementioned
> > > concerns.
> > >
> > > --
> > >
> >
> > I myself haven't used this, but wouldn't it make sense if the possible array size ranges from some small number up to the limit of what fits in your machine's memory? My desktop isn't too happy when I try to allocate 50 million zeros, so I would want to avoid doing so, if that were really my best guess for the size.
> >
> > Like I said, I don't know much about the Matlab internals, so I have no idea why appending large arrays to a cell array makes sense, but appending zeros to a normal array doesn't. I just figured, if you have no choice but to repeatedly allocate additional space, you may as well minimize the allocation overhead by doing it ~log(N) times instead of ~N times.
>
> Oops, I should do more reading and less writing!
>
> Anyway, in one of our (non-Matlab) programs, we have a user-defined upper memory limit (akin to a max Matrix size). When we hit that, we decimate by a factor of 2, point our index to the middle and then start writing every 2 steps instead of every 1 step. Etc...
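
(For illustration, that decimate-on-overflow scheme might look like the
sketch below in MATLAB; the fixed limit maxRows, the stride bookkeeping
and the fake data stream are assumptions for the example, not code from
the actual program.)

%----------------------------------------
maxRows = 8;                      % assumed user-defined memory limit
buf     = zeros(maxRows, 1);
n       = 0;                      % rows currently in use
stride  = 1;                      % keep every stride-th incoming sample
for k = 1:100                     % stand-in for the incoming data stream
    if mod(k-1, stride) ~= 0
        continue                  % skip samples between kept steps
    end
    if n == maxRows
        buf(1:maxRows/2) = buf(1:2:end);  % decimate by a factor of 2
        n = maxRows/2;                    % point the index to the middle
        stride = 2*stride;                % now write every 2 steps, etc.
    end
    n = n + 1;
    buf(n) = k;
end
%----------------------------------------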


Wow, thanks for your swift replies!
I could sort of guesstimate the maximum size of the array, which should not exceed 3×10⁶ rows, but the catch is that I need this array to be persistent, and that leads to extremely slow execution. According to the profiler, most of the time goes to overhead and built-ins, along with, I think, keeping the matrix persistent throughout the whole process. The actual operations on its elements are really not that complex...
It's funny, though, how the execution time blows up as the loop size increases...
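
(For reference, the preallocate-and-trim approach suggested above would
look something like this, assuming the guessed 3×10⁶ bound; moreData()
and nextRow() are hypothetical stand-ins for whatever produces the rows.)

%----------------------------------------
nMax = 3e6;                 % guessed upper bound on the row count
buf  = zeros(nMax, 3);      % allocate once, up front
n    = 0;                   % rows actually filled so far
while moreData()            % hypothetical termination test
    n = n + 1;
    buf(n, :) = nextRow();  % hypothetical producer of one row
end
buf = buf(1:n, :);          % trim the unused tail at the end
%----------------------------------------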
From: dpb on
Thibault Daoulas wrote:
....

> I could sort of guesstimate the maximum size of the array, which should
> not exceed 3×10⁶ rows, but the catch is that I need this array to be
> persistent, and that leads to extremely slow execution. According to the
> profiler, most of the time goes to overhead and built-ins, along with, I
> think, keeping the matrix persistent throughout the whole process. The
> actual operations on its elements are really not that complex...
> It's funny, though, how the execution time blows up as the loop size
> increases...

Don't follow about the "persistent" bit.

Think we need to see code here.

One thing I can think of would be if you're not addressing data linearly
in memory.
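
(MATLAB stores arrays column-major, so "linearly" means walking down
columns. A quick way to see the gap:)

%----------------------------------------
A = rand(4000);                % 4000-by-4000, stored column by column
tic
s = 0;
for jj = 1:4000
    s = s + sum(A(:, jj));     % down a column: contiguous memory
end
toc
tic
s = 0;
for ii = 1:4000
    s = s + sum(A(ii, :));     % along a row: strided access, slower
end
toc
%----------------------------------------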

--
From: Thibault Daoulas on
dpb <none(a)non.net> wrote in message <hu93jm$efh$3(a)news.eternal-september.org>...
> Thibault Daoulas wrote:
> ...
>
> > I could sort of guesstimate the maximum size of the array, which should
> > not exceed 3×10⁶ rows, but the catch is that I need this array to be
> > persistent, and that leads to extremely slow execution. According to the
> > profiler, most of the time goes to overhead and built-ins, along with, I
> > think, keeping the matrix persistent throughout the whole process. The
> > actual operations on its elements are really not that complex...
> > It's funny, though, how the execution time blows up as the loop size
> > increases...
>
> Don't follow about the "persistent" bit.
>
> Think we need to see code here.
>
> One thing I can think of would be if you're not addressing data linearly
> in memory.
>
> --
What I meant was just that this piece of code which stores/updates/manipulates the matrix is called from some outer process, hence the matrix is declared as persistent, in MATLAB terms. I suspected that to be the cause of such slow handling, compared to, for instance, dealing with it as a global variable, though I'm not really aware of what this difference implies in terms of memory.

As for the code, it's this kind of thing:

main.m :

for ii = 1:length(X)
    % do a bunch of things
    result = my_slow_function(a(ii), b(ii), c(ii));
end

my_slow_function.m :

function result = my_slow_function(A, B, C)
persistent my_matrix
% fill my_matrix with A, B, C (appended as a new row)
my_matrix = cat(1, my_matrix, [A B C]);
% compute stuff from the whole data contained in my_matrix
getOldData = min(abs(my_matrix(:, 1) - A/1000));
getOldMean = mean(my_matrix(A:B));
result = [getOldData getOldMean];


So yeah, that's the idea, and I'm actually assigning data linearly, since only 3 numbers are appended at each call of my_slow_function. What obviously takes more time as the size grows is this:
- min(abs(my_matrix(:, 1)-A/1000))
But I don't think this is the cause of the problem. Apparently it's keeping my_matrix up to date, but again, I'm doubtful...
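
(One way to combine the preallocation advice with the persistent buffer
is to keep a fill count alongside the array, so cat() never runs inside
the loop. A sketch only: the 3e6 capacity is the guessed bound from
earlier, and the mean is taken over column 2 of the filled rows, since
my_matrix(A:B) above looks like a placeholder.)

%----------------------------------------
function result = my_slow_function(A, B, C)
persistent my_matrix n
if isempty(my_matrix)
    my_matrix = zeros(3e6, 3);  % preallocate to the guessed bound
    n = 0;
end
n = n + 1;
my_matrix(n, :) = [A B C];      % O(1) append, no reallocation
% compute only over the rows filled so far
getOldData = min(abs(my_matrix(1:n, 1) - A/1000));
getOldMean = mean(my_matrix(1:n, 2));   % still O(n) per call
result = [getOldData getOldMean];
%----------------------------------------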
From: Bruno Luong on
"Steve Amphlett" <Firstname.Lastname(a)Where-I-Work.com> wrote in message <hu8qjp$1hu$1(a)fred.mathworks.com>...

>
> [m,n]=size(x);
> x(m*2,1)=0;

Or with a single-line command (should not start from empty though)

x(2*end,end)=0;

Bruno
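
(In a loop, that one-liner becomes an amortized doubling guard, roughly:)

%----------------------------------------
x = zeros(1000, 3);          % small non-empty initial allocation
n = 0;
for k = 1:1e5
    n = n + 1;
    if n > size(x, 1)
        x(2*end, end) = 0;   % double the row count in place
    end
    x(n, :) = [k 2*k k/3];   % example payload
end
x = x(1:n, :);               % trim the unused tail afterwards
%----------------------------------------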
From: Thibault Daoulas on
"Bruno Luong" <b.luong(a)fogale.findmycountry> wrote in message <huadq5$lbl$1(a)fred.mathworks.com>...
> "Steve Amphlett" <Firstname.Lastname(a)Where-I-Work.com> wrote in message <hu8qjp$1hu$1(a)fred.mathworks.com>...
>
> >
> > [m,n]=size(x);
> > x(m*2,1)=0;
>
> Or with a single-line command (should not start from empty though)
>
> x(2*end,end)=0;
>
> Bruno

Yep, that makes sense.
So it's indeed not the persistent declaration that drags the whole thing down, but rather the operations on the matrix. For instance, among others, I calculate a mean each time, and after writing a small example I found out this is at least one of the bad things that make the execution time sink...
I shouldn't be surprised about that, but it is quite annoying, as I need to keep this history of means... Any suggestions on that, big boys? :)

%----------------------------------------
tic
buffSizeM = 10000;
bufM = zeros(buffSizeM, 3);    % preallocated buffer

for ii = 1:buffSizeM
    bufM(ii, :) = [ii ii ii/3];
    mean(bufM(1:ii, 2));       % mean over a growing prefix: O(ii) work per pass
end
toc
%----------------------------------------

buffSizeM =  10000 : Elapsed time is 0.576963 seconds.
buffSizeM =  20000 : Elapsed time is 2.136694 seconds.
buffSizeM =  50000 : Elapsed time is 10.099831 seconds.
buffSizeM = 100000 : Elapsed time is 48.111217 seconds.
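
(If the history of means is what's needed, the mean doesn't have to be
recomputed from scratch on every pass: carrying a running sum makes each
update O(1) instead of O(ii), turning the quadratic total cost above into
a linear one. The same benchmark with that change, as a sketch:)

%----------------------------------------
tic
buffSizeM = 100000;
bufM   = zeros(buffSizeM, 3);
means  = zeros(buffSizeM, 1);   % the history of means, kept
runSum = 0;
for ii = 1:buffSizeM
    bufM(ii, :) = [ii ii ii/3];
    runSum    = runSum + bufM(ii, 2);
    means(ii) = runSum / ii;    % O(1) update, no growing-prefix scan
end
toc
%----------------------------------------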