Index matrix to sparse matrix... saving memory [Matlab]

From: Anthony Hopf on 9 Jun 2010 10:21

Thank you Matt and Bruno!! I apologize for not staying with this topic, but I have quite a bit on my plate... and I am taking your advice, and TRYING to learn. You have both been very helpful and I hope you won't give up on me yet!!

The reason I didn't stick with Matt's suggestion is I could never get values into the cells I and J, and then I got pulled from my work (circumstances that I do not wish to bring up).

I am working with your suggestions and trying to apply the code to a 3d matrix. First, though, I started playing with your code to see what happens if there are NaN values within r, and I ran into a problem that I thought I could explain. The error I receive is:

??? Error using ==> sparse
Index into matrix must be positive.

I realize the problem is that bin will contain 0 values, because the NaN points count as out of bounds... so I thought a remedy would be:

r_4d=peaks(50)+7; % + 7 to make data positive
r_4d(25:30,25:30) = NaN;
bin_size = 0.1;
%[ index_range ] = range_bin_sparse( r_4d, bin_size );

% Here is a way to use HISTC
nbins = 151;%ceil(max(r_4d(:))/bin_size);
r_bin = (0:nbins)*bin_size;
[null,bin] = histc(r_4d,r_bin);
bool = sparse(1:nnz(bin), nonzeros(bin),true);

As you notice, I have set r_4d(25:30,25:30) to NaN and commented out index_range (this line also perplexes me). My final two changes are in the sparse command, where I try to compensate for the out-of-bounds points... and this seems to mess things up. When I go to a much larger data set with a 3d matrix r_4d the code runs, and the number of nonzero points in bool matches the number of ~isnan(r_4d) points, so I know histc is still working... but when I index using bool the returned values are not correct.

for the above code:
without NaNs:
bool(:,50)

ans =

(525,1) 1
(526,1) 1
(527,1) 1
(528,1) 1
(580,1) 1
(826,1) 1
(830,1) 1
(1013,1) 1
(1064,1) 1
(1206,1) 1
(1406,1) 1
(1517,1) 1

with NaNs:
bool(:,50)

ans =

(525,1) 1
(526,1) 1
(527,1) 1
(528,1) 1
(580,1) 1
(826,1) 1
(830,1) 1
(1013,1) 1
(1064,1) 1
(1206,1) 1
(1382,1) 1
(1481,1) 1
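
A note on the shifted entries above (1406 became 1382, 1517 became 1481): nonzeros(bin) drops the zero entries, so the row indices 1:nnz(bin) count only the non-NaN points and no longer line up with the linear indices of r_4d. In this test, 24 of the 36 NaN points have linear indices below 1406 and all 36 lie below 1517, which matches the two offsets. A minimal sketch of one possible fix, assuming it is acceptable to keep an all-false row for each NaN point; the variable valid and the explicit size arguments are additions for illustration, not code from the thread:

```matlab
r_4d = peaks(50) + 7;               % same test data as in the post
r_4d(25:30,25:30) = NaN;
bin_size = 0.1;
nbins = 151;
r_bin = (0:nbins)*bin_size;

[~, bin] = histc(r_4d(:), r_bin);   % bin is 0 for NaN / out-of-range points
valid = find(bin > 0);              % linear indices of the points that got a bin

% Row = linear index into r_4d, column = bin number. Passing the size
% explicitly keeps one row per element, even for the NaN points.
bool = sparse(valid, bin(valid), true, numel(bin), numel(r_bin));

idx50 = find(bool(:,50));           % linear indices of the points in bin 50
```

Because the rows still correspond to elements of r_4d, r_4d(bool(:,50)) can be used for logical indexing as before.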

Can you please give me a little more advice on this?

Thank you

"Bruno Luong" <b.luong(a)fogale.findmycountry> wrote in message <hum8vg$65s$1(a)fred.mathworks.com>...
> First, this is a month-old thread. I'm sorry, but if you haven't paid enough attention to what has been suggested for a month (including Matt's remark about the misuse of sparse), then a lot of time is wasted.
>
> Now this will be the last time I write something on this topic, then I'll move on. Here we go:
>
> First I changed your original function by removing the garbage inside, and also changed one comparison test (from > to >=) to match HISTC.
>
> %%
> function [ index_range ] = range_bin_sparse( r_4d, bin_size )
> %RANGE_BIN Identifies the index values in r_4d that are within a bin of a given size
> % each radar is done individually and the output file is a sparse
> % matrix
> % [ range_index ] = range_bin_sparse( r_4d, bin_size )
>
> %single range bin binning of a single radar
> r_bin_max = ceil((((max(r_4d(:)))))/bin_size);
> r_bin = (bin_size/2:bin_size:r_bin_max*bin_size+bin_size/2);%[m]
> index_range = zeros(length(r_bin),250);
>
> %filter variables
> rH = r_bin + bin_size/2;
> rL = r_bin - bin_size/2;
> %find acceptable values for each range bin
> for v=1:length(r_bin)
> index_temp = find((r_4d) >= rL(v) & (r_4d) <rH(v)); % change here
> index_range(v,1:length(index_temp)) = index_temp;
> end
> index_range = sparse(index_range');
>
> end
>
> % Now here is the data I use to test
> r_4d=peaks(50)+7; % + 7 to make data positive
> bin_size = 0.1;
> [ index_range ] = range_bin_sparse( r_4d, bin_size );
>
> % Here is a way to use HISTC
> nbins = ceil(max(r_4d(:))/bin_size);
> edges = (0:nbins)*bin_size;
> [~,bin] = histc(r_4d,edges);
> bool = sparse(1:numel(bin), bin,true);
>
> % Note that the logical sparse matrix BOOL is not exactly like index_range, but it contains the
> % same information. If you want to find which indices fall in the 50th bin, you can do this:
>
> >> index_range(:,50)
>
> ans =
>
> (1,1) 525
> (2,1) 526
> (3,1) 527
> (4,1) 528
> (5,1) 580
> (6,1) 826
> (7,1) 830
> (8,1) 1013
> (9,1) 1064
> (10,1) 1206
> (11,1) 1406
> (12,1) 1517
>
> % Or this:
>
> >> find(bool(:,50))
>
> ans =
>
> 525
> 526
> 527
> 528
> 580
> 826
> 830
> 1013
> 1064
> 1206
> 1406
> 1517
>
> % You see that they contain the same thing.
>
> In practice, the next calculation does not need the FIND command, because BOOL can be used as a matrix for LOGICAL indexing; that's what Matt suggested to you. For example, to get the values of the array r_4d() falling in the 50th bin:
>
> >> r_4d(bool(:,50))
>
> ans =
>
> 4.9948
> 4.9059
> 4.9003
> 4.9737
> 4.9775
> 4.9397
> 4.9658
> 4.9965
> 4.9447
> 4.9068
> 4.9602
> 4.9365
>
> % Check if they fall inside the bins
>
> >> r_4d(bool(:,50))>=edges(50) & r_4d(bool(:,50))<edges(51)
>
> ans =
>
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
>
> >>
>
> % Bruno

From: Anthony Hopf on 9 Jun 2010 10:59

So I think a couple lines fixed the problem:

%%%%%%
bin_size = 100;
r_bin_max = ceil((((nanmax(double(r_4d(:))))))/bin_size);
r_bin = (0:bin_size:r_bin_max*bin_size);%[m]
[null,bin] = histc(r_4d(:),r_bin);
bin(bin==0)=max(bin(:))+1;
bool = sparse(1:numel(bin), (bin),true);
bool = bool(:,1:end-1);

I added the r_4d(:) so I can work with 3d matrices. I took all the entries where bin == 0 and changed them to a column index one past the largest real bin, and then lopped that extra column off at the end.
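
As a small illustration of the dummy-column trick described above, here is a sketch on made-up data (x and edges are illustrative values, not the radar data). It also shows a variant, not from the thread, that keeps only the valid points and passes the size to sparse explicitly:

```matlab
x = [0.5 NaN 2.5 1.2 3.7];         % made-up data; NaN and 3.7 fall outside the edges
edges = 0:3;

[~, bin] = histc(x(:), edges);     % bin = [1;0;3;2;0]
n = numel(bin);

% Trick from the post: send the zeros to one extra column, then drop it
b = bin;
b(b == 0) = max(b) + 1;
bool_a = sparse((1:n).', b, true);
bool_a = bool_a(:, 1:end-1);

% Variant: keep only the valid points and fix the size up front
valid = find(bin > 0);
bool_b = sparse(valid, bin(valid), true, n, max(bin));
```

Both forms give the same 5-by-3 logical matrix here. One difference to be aware of: if no entry of bin is zero, the first form has no dummy column, so 1:end-1 would drop a real bin, while the second form is unaffected. Passing numel(edges) instead of max(bin) as the last argument would make the width depend only on the edges.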

Again, thank you for all of your help. Now to apply this to r, theta, and phi... I think I am going to do this sequentially.

Anthony

From: Anthony Hopf on 9 Jun 2010 17:58

I feel like I am sooo close....

My issue is with finding the intersection between 3 sparse logical matrices

idxr, idxa, and idxe are all sparse logical matrices. The r, a and e correspond to range, azimuth and elevation. With the help of Bruno and Matt I find them like this:

r_bin_max = ceil((((nanmax(r_4d(:)))))/bin_size);
r_bin = (bin_size/2:bin_size:r_bin_max*bin_size+bin_size/2);%[m]
az_bin = 0:beamwidth:360;%[deg]
el_bin = 0:beamwidth:thetae_max;%[deg]
%
[null,bin] = histc(r_4d(:),r_bin);
clear null
bin(bin==0)=max(bin(:))+1;
idxr = sparse(1:numel(bin), (bin),true);
idxr = idxr(:,1:end-1);
[null,ebin] = histc(el(:),el_bin);
clear null
ebin(ebin==0)=max(ebin(:))+1;
idxe = sparse(1:numel(ebin), (ebin),true);
idxe = idxe(:,1:end-1);
[null,abin] = histc(az(:),az_bin);
clear null
abin(abin==0)=max(abin(:))+1;
idxa = sparse(1:numel(abin), (abin),true);
idxa = idxa(:,1:end-1);

%% I was going to use histcn but this takes less than 0.5sec to complete with histc, the histc calls take ~0.05sec, a similar histcn call would be ~0.15sec.

This is far faster than the way I was doing it before, but I still need the intersection of the three to build an index representing a 3d sampling in r, theta, phi... the way I do it now is:

idxbr = zeros(size(idxr,2)*size(idxa,2)*size(idxe,2),100);

counter = 0;
for iir = 1:size(idxr,2)            % range bins
    for iia = 1:size(idxa,2)        % azimuth bins
        for iie = 1:size(idxe,2)    % elevation bins
            % indices common to this range, azimuth and elevation bin
            idxrae = intersect(find(idxr(:,iir)),intersect(find(idxa(:,iia)),find(idxe(:,iie))));
            counter = counter+1;
            idxbr(1:length(idxrae),counter) = idxrae;
            clear idxrae
        end % end of elevation loop
    end % end of azimuth loop
end % end of range loop
idxbr = sparse(idxbr);
end%function end

This not only takes an extremely long time for large idx* arrays, but I also lose the excellent logical sparse arrays Bruno and Matt taught me how to make.

If it is any consolation, I know everything I need is in bin, ebin, and abin... I can see it when I print the values out; the bins are right there to use.

Can I get rid of these for loops?

Thanks, Anthony
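
One possible way to avoid the triple loop above is to fold the three bin numbers of each point into a single cell number with sub2ind and build one sparse logical matrix from that. This is only a sketch on made-up data: rbin, abin and ebin stand in for the three bin-index vectors that histc returns (0 meaning out of range), and nr, na, ne are assumed bin counts.

```matlab
% Made-up bin indices for six points; point 4 is out of range in rbin
rbin = [1 2 2 0 1 2].';
abin = [1 1 3 2 1 1].';
ebin = [2 1 1 1 2 1].';
nr = 2; na = 3; ne = 2;             % assumed number of bins per dimension

% Keep only the points that have a valid bin in all three dimensions
valid = find(rbin > 0 & abin > 0 & ebin > 0);

% One column per (range, azimuth, elevation) combination
cell_id = sub2ind([nr na ne], rbin(valid), abin(valid), ebin(valid));
idxrae = sparse(valid, cell_id, true, numel(rbin), nr*na*ne);

% Points in range bin 1, azimuth bin 1, elevation bin 2
pts = find(idxrae(:, sub2ind([nr na ne], 1, 1, 2)));
```

The result stays a sparse logical matrix, so a column can be used directly for logical indexing. The column order follows sub2ind, which differs from the counter order of the loops (range outermost, elevation innermost); sub2ind([ne na nr], ebin, abin, rbin) would reproduce that order.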

From: Bruno Luong on 10 Jun 2010 05:28

"Anthony Hopf" <anthony.hopf(a)gmail.com> wrote in message <hup2ps$rmb$1(a)fred.mathworks.com>...

>
> Can I get rid of these for loops?

For multi-dimensional histogram use Fex submission HISTCN.

Bruno

From: Anthony Hopf on 10 Jun 2010 14:58

Bruno or anyone that has looked through the thread,

I'm beating myself up here... I just can't see through to the end. Histcn returns the same thing histc does, except that it returns one output "loc" rather than multiple "bin" outputs. If I concatenate the bins:

[bin1 bin2 bin3] == [ loc ]

Where I hit a wall is making use of loc. Loc has all of my bin locations, which I need, and it computes them very quickly, but I need to relate the columns of "loc" to one another and visit every combination of bin values. This iterative comparison is what slows me down, and it is what I have been stuck on conceptually; it is also what originally slowed down my understanding of your function histcn.

Does anyone know of a function that will collapse these for loops after I have used histcn to find the bin values? The for loops step through the bin values of each column of "loc", returning the matching row indices of "loc".

%I previously used histcn to create "loc"
idxbr = zeros(length(azm2)*length(el2)*length(rb2),100);
counter = 0;

for iia = 1:max(loc(:,1)) %stepping through loc(:,1)
    for iie = 1:max(loc(:,2)) %stepping through loc(:,2)
        for iir = 1:max(loc(:,3)) %stepping through loc(:,3)
            counter = counter+1;
            tempfind = find(loc2(:,1)==iia & loc2(:,2)==iie & loc2(:,3) == iir); %find all values corresponding to loc = [1 1 1] to loc = [max(loc(:,1)) max(loc(:,2)) max(loc(:,3))] and everything in between.
            idxbr(1:length(tempfind),counter) = tempfind(:); %store the indexes
            clear tempfind
        end
    end
end

My buddy does this kind of lookup in SQL databases, but I don't know how to implement it in Matlab... I would assume there is a function, a trick, or something I missed in histcn that someone can hint at? If I just had to find all points that sit in bin = 1, 2, 3, ... then this would be very simple, but because I have to look at each combination of bins it gets tricky.
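
The lookup described here is similar in spirit to a GROUP BY on the three columns of loc. A minimal sketch of that idea uses accumarray with a cell-valued function to collect the row indices of each bin combination in one pass. The loc values and sz below are made up, and it is an assumption that rows containing a 0 mark out-of-range points and should be skipped:

```matlab
% Made-up bin triplets, one row per point; row 4 contains a 0
loc = [1 1 2; 2 1 1; 2 3 1; 0 2 1; 1 1 2; 2 1 1];
sz  = [2 3 2];                      % assumed number of bins per dimension

% Rows whose three bin numbers are all valid
valid = find(all(loc > 0, 2));

% groups{i,j,k} lists the rows of loc that equal [i j k]
groups = accumarray(loc(valid,:), valid, sz, @(v) {sort(v)});

rows_112 = groups{1,1,2};           % rows of loc equal to [1 1 2]
```

Combinations that never occur are left as empty cells, so no fixed width such as the 100 columns of idxbr has to be guessed in advance.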

I realize this is a long thread but I hope someone reads this and can give me a hint!!

It is greatly appreciated.

Anthony
