From: Oleg Komarov on
"Li "
> I am working on a set of data with millions of rows. So I have to find the fastest way to process it. Now I just want to remove the rows, if the vector (the first column) in that row is equal to zero. I used the below code, and it took almost an hour to finish. Is there a quicker way to do this? Thanks!
>
> Ind = [];
> for i = 1:m;
> if A(i, 1) == 0 ;
> Ind = [Ind; i];
> end;
> end;
> A (Ind, :) = [];

Vectorize the code:
A(A(:,1) == 0,:) = [];
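
The loop is most likely slow because Ind = [Ind; i] grows the index vector on every hit, so the vector is reallocated and copied repeatedly. As a minimal sketch of the vectorized form on a small matrix (the sample data below is made up for illustration, not taken from the original post):

```matlab
% Hypothetical sample data (not from the original post).
A = [0 10; 3 20; 0 30; 7 40];

mask = A(:,1) == 0;   % logical column vector, true where the first column is zero
A(mask, :) = [];      % delete all flagged rows in a single operation

disp(A)               % remaining rows: [3 20; 7 40]
```

The comparison runs over the whole column at once, and the deletion happens in one step instead of building the index list row by row.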

Oleg
From: XYZ on
On 25.03.2010 21:34, XYZ wrote:
> On 25.03.2010 21:29, Li wrote:
>> I am working on a set of data with millions of rows. So I have to find
>> the fastest way to process it. Now I just want to remove the rows, if
>> the vector (the first column) in that row is equal to zero. I used the
>> below code, and it took almost an hour to finish. Is there a quicker way
>> to do this? Thanks!
>>
>> Ind = [];
>> for i = 1:m; if A(i, 1) == 0 ;
>> Ind = [Ind; i];
>> end;
>> end;
>> A (Ind, :) = [];
>
> A = A(find(A(:,1)),:)

This one should be faster
A = A(A(:,1)~=0,:)
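
A quick check that the two forms select the same rows, using made-up sample data (the variable names viaFind and viaMask are only for illustration):

```matlab
% Hypothetical sample data.
A = [0 1; 2 3; 0 4; 5 6];

viaFind = A(find(A(:,1)), :);    % find returns the indices of the nonzero entries
viaMask = A(A(:,1) ~= 0, :);     % the logical mask is used directly as the row index

assert(isequal(viaFind, viaMask));   % both keep rows [2 3; 5 6]
```

The mask version skips the conversion from a logical vector to an index list, which is presumably why it is expected to be faster.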
From: Li on
Thanks. I know you meant A = A ( find ( A(:,1)=0 ), :) ?

That is awesome. I appreciate your help very much!


XYZ <junk(a)mail.bin> wrote in message <hoghbr$ids$1(a)news.onet.pl>...
> On 25.03.2010 21:29, Li wrote:
> > I am working on a set of data with millions of rows. So I have to find
> > the fastest way to process it. Now I just want to remove the rows, if
> > the vector (the first column) in that row is equal to zero. I used the
> > below code, and it took almost an hour to finish. Is there a quicker way
> > to do this? Thanks!
> >
> > Ind = [];
> > for i = 1:m; if A(i, 1) == 0 ;
> > Ind = [Ind; i];
> > end;
> > end;
> > A (Ind, :) = [];
>
> A = A(find(A(:,1)),:)
From: Walter Roberson on
XYZ wrote:
> On 25.03.2010 21:34, XYZ wrote:
>> On 25.03.2010 21:29, Li wrote:
>>> I am working on a set of data with millions of rows. So I have to find
>>> the fastest way to process it. Now I just want to remove the rows, if
>>> the vector (the first column) in that row is equal to zero. I used the
>>> below code, and it took almost an hour to finish. Is there a quicker way
>>> to do this? Thanks!
>>>
>>> Ind = [];
>>> for i = 1:m; if A(i, 1) == 0 ;
>>> Ind = [Ind; i];
>>> end;
>>> end;
>>> A (Ind, :) = [];
>>
>> A = A(find(A(:,1)),:)
>
> This one should be faster
> A = A(A(:,1)~=0,:)

Another way would be

A(~A(:,1),:) = [];

I timed both (at the command line prompt, not in a function) for a 10000 x 50
matrix. The first time I ran XYZ's method and then my method, my method came
out faster. However, on repeated time trials, XYZ's method usually took about
half of the first time I measured for it, whereas the timing for my method
varied widely and was often about 2.5 times slower than the better readings
for XYZ's method.

Command line timings do not necessarily reflect what would be seen in a
function, as the "just in time" compiling done in functions can make a
significant difference.
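
One way to time the two approaches inside functions is sketched below. This is only an illustration: the function name compare_row_removal, the helper names, and the test data are assumptions, and timeit is only available in MATLAB releases newer than the ones current when this thread was written (tic/toc could be substituted).

```matlab
function [tKeep, tDelete] = compare_row_removal()
% Hypothetical benchmark; the names and test data are assumptions.
rng(0);                              % reproducible random data
A = rand(10000, 50);                 % same size as in the timing above
A(rand(10000, 1) < 0.5, 1) = 0;      % zero out roughly half of the first column

tKeep   = timeit(@() keepRows(A));   % keep the rows with a nonzero first column
tDelete = timeit(@() deleteRows(A)); % delete the rows with a zero first column
fprintf('keep: %.4g s, delete: %.4g s\n', tKeep, tDelete);
end

function B = keepRows(M)
% XYZ's approach: select the rows to keep with a logical mask.
B = M(M(:,1) ~= 0, :);
end

function B = deleteRows(M)
% Deletion approach: remove the rows whose first column is zero.
M(~M(:,1), :) = [];
B = M;
end
```

Saved as compare_row_removal.m, it runs both methods on the same matrix several times and reports a typical time for each. The results will depend on the machine and the MATLAB release.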
From: Walter Roberson on
Li wrote:
> Thanks. I know you meant A = A ( find ( A(:,1)=0 ), :) ?

No, that is the opposite of XYZ's. That would find the indices where the first
column *was* 0, and save only the rows in which that was true. XYZ's method
finds the indices where the first column is *not* 0 and saves those rows.

>> A = A(find(A(:,1)),:)
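
A small made-up example shows that the two expressions select complementary sets of rows (the variable names are only for illustration):

```matlab
% Hypothetical sample data.
A = [0 1; 2 3; 0 4; 5 6];

keptZero    = A(find(A(:,1) == 0), :);   % keeps rows whose first column IS zero
keptNonzero = A(find(A(:,1)), :);        % XYZ's version: keeps rows whose first column is NOT zero

disp(keptZero)      % rows [0 1; 0 4]
disp(keptNonzero)   % rows [2 3; 5 6]
```

Only the second result matches the original goal of removing the rows that start with zero.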