From: Jan Simon on 6 Jan 2010 14:50

Dear Matt!

> > [foo,foo,foo,goo,goo]=f();
> I know you use this approach, as I do. I was asking why other folks prefer to create a dummy variable in the workspace.

The dummy is created with this also, but immediately overwritten!
If I have a gigantic vector, which nearly fills my memory, and I want to sort it with:
  [dummy, index] = sort(Array);
or
  [index, index] = sort(Array);
in both cases the sorted array is created (any different opinions?)! In the case of cell strings, the sorted array contains only shared copies of the data, fortunately.
But if I never need the sorted array, it would be nice to have a SORTIND, which returns just the index vector. Is anybody willing to publish a MEX wrapper for a quicksort???

That [index] and [index] have the same address is not really surprising. It is the same variable. You do not need a FORMAT DEBUG for that.

I failed: Matlab 6.5 has no sort.c, only sortcellchar.c. This exists in Matlab 7.8 also, but is not documented. Luckily this solves my question for cell strings. The MEX function sortrowsc.c is not an alternative, because it is 6 times slower than SORT if applied to a single column (why?!).

If I show "[index, index] = sort(Array)" to a Matlab beginner, I have to explain that I assume the output arguments of a function are assigned from left to right.
If I show "[dummy, index] = sort(Array)", I do not have to explain anything.

Kind regards, Jan

BTW: Has the OP tested my attempt to solve the problem faster than UNIQUE/HISTC/ISMEMBC?
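A minimal sketch of the index-only call discussed above, assuming a release that supports the tilde placeholder for ignored outputs (R2009b or later). The data in `Array` and the name `sortedViaIndex` are made up for illustration. As far as I know SORT still computes the sorted array internally, so this probably saves a workspace name rather than peak memory.

```matlab
% Sketch: ask SORT for the index vector only (assumes "~" output syntax exists).
Array = [3 1 2 5 4];            % example data, not from the thread
[~, index] = sort(Array);       % first output is ignored and never named
sortedViaIndex = Array(index);  % rebuild the sorted values only if they are needed
disp(index)
```

The sorted values can always be recovered later with `Array(index)`, which is the extra indexing step that gets timed further down in the thread.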
From: us on 6 Jan 2010 16:01

"Bruno Luong" <b.luong(a)fogale.findmycountry> wrote in message <hi2kg4$b73$1(a)fred.mathworks.com>...
> "Matt Fig" <spamanon(a)yahoo.com> wrote in message <hi2im0$fbd$1(a)fred.mathworks.com>...
> > "us " <us(a)neurol.unizh.ch> wrote in message
> > > this has been shown many a times in CSSM...
> > > - i'll always use
> > > [foo,foo,foo,goo,goo]=f();
> > > approach
> >
> > I know you use this approach, as I do. I was asking why other folks prefer to create a dummy variable in the workspace.
>
> Matt and us, I for once prefer the "dummy" approach (followed by a "clear" statement). It's just more readable to me (and I force myself to use different names for different variables). Any eloquence argument to convince me otherwise?
>
> Bruno

bruno
- NO: here we go...
- i know: EVAL(!)... and nested TRY/CATCH...
- but YOU will easily understand...
- wintel sys ic2/2*2.6ghz/2gb/winxp.sp3.32/r2009b...

% create FOO.M
function foo(varargin)
% 1) subroutine GOO returns 2 vars in diff memory locations
% 2) fill FOO WS with 2 vars
% 3) if memory overflows...
% 4) clear all WS vars
% 5) fill FOO WS with 1 var at same memory address
     nt=10^8; % <- #var to create...
if ~nargin
     n=1000000; %#ok (r2009b)
else
     n=varargin{1}; %#ok (r2009b)
end
% fill FOO WS with individual var A_xxx/B_xxx
try
for i=1:nt
     com=sprintf('[a_%d,b_%d]=goo(n);',i,i);
     eval(com);
end
catch %#ok
     w=whos('b*'); % <- only count B_xxx
     disp(sprintf('ERROR at index %5d %5d',i,numel(w)));
     clear a* b*; % <- CLEAR VARS
% fill FOO WS with individual var B_xxx
try
for i=1:nt
     com=sprintf('[b_%d,b_%d]=goo(n);',i,i);
     eval(com);
end
catch %#ok
     w=whos('b*'); % <- only count B_xxx
     disp(sprintf('ERROR at index %5d %5d',i,numel(w)));
end
end
end
function [a,b]=goo(varargin) %#ok
     a=zeros(1,varargin{1},'double');
     b=a;
     b(1)=b(1)+1; % <- allocate new memory...
end

% at the command prompt
     clear all; % <- !!!!!
     foo(100000)
%{
ERROR at index 214 213
ERROR at index 427 426
%}
     foo(1000000)
%{
ERROR at index 17 16
ERROR at index 33 32
%}

hence, [DUMMY,VAR] is taxing the WS, whilst [VAR,VAR] is NOT (as much)...

just a thought...
urs
From: Jan Simon on 6 Jan 2010 16:07

Dear Bruno!

> Jan, how much do you estimate an inplace sorting (e.g., on large double array) would save time?

It is clear that the creation of the sort index is the demanding part of SORT, while copying the input in sorted order is usually secondary.
Unfortunately one of the 2 DIMM ports of my computer is damaged and I have to live with 512MB RAM. But saving temporarily used memory is always useful, even on a 16 GB machine. Nevertheless, timing already matters for an 8MB array.

For the estimation of the speed gain:
  x = rand(1e6, 1);
  tic; y = sort(x); toc        ==> 0.26 sec
  tic; [y, s] = sort(x); toc   ==> 0.40 sec
  tic; y = x(s); toc           ==> 0.14 sec
So I assume I could save 35% computing time (0.14 of 0.40 sec). I assume that SORT sorts the values inplace and that the sorting index is created simultaneously if 2 outputs are used.

Sorting a UINT16 array is much faster:
  x = uint16(x * 32000);
  tic; y = sort(x); toc        ==> 0.16 sec
  tic; [y, s] = sort(x); toc   ==> 0.31 sec
Creating the sorting index needs an additional 0.15 sec, as in the DOUBLE case above.

If the returned index vector could be a UINT32 array, this would be even nicer:
  ints = uint32(s);
  tic; y = x(ints); toc        ==> 0.09 sec (instead of 0.14 sec)

> I'm quite happy with the performance of matlab SORT, except I wish to have a sorting routine where the comparison operator can be customized - but I guess such feature is very inefficient in Matlab due to overhead.

SORT is really fine, you are right! But this is not a reason to avoid improving it.
If you would create a MEX which returns the sorting index (perhaps in a selectable type), which uses the standard comparison or optionally a user-defined operator, and which can compete in speed with single-output SORT -- *I* would download it! Promised.
I assume that calling a user-defined operator through mexCallMATLAB would be a great brake.

Kind regards, Jan
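The measurements above can be repeated with a short script. This is only a sketch of the same comparison, with variable names (`t1`..`t4`, `s32`) chosen here for illustration. Absolute times depend on the machine and the MATLAB release, so no expected numbers are given.

```matlab
% Sketch: estimate how much of a two-output SORT is spent on the sorted copy.
x = rand(1e6, 1);                    % same size as in the post above

tic; y1 = sort(x);       t1 = toc;   % sorted values only
tic; [y2, s] = sort(x);  t2 = toc;   % sorted values plus index vector
tic; y3 = x(s);          t3 = toc;   % permutation with a DOUBLE index
s32 = uint32(s);                     % the same index stored as UINT32
tic; y4 = x(s32);        t4 = toc;   % permutation with a UINT32 index

fprintf('sort, 1 output     : %.3f s\n', t1);
fprintf('sort, 2 outputs    : %.3f s\n', t2);
fprintf('x(s), double index : %.3f s\n', t3);
fprintf('x(s), uint32 index : %.3f s\n', t4);
fprintf('copy / 2-output sort: %.0f%%\n', 100 * t3 / t2);
```

The last line reports the ratio used for the 35% estimate. It is an upper bound on the possible saving only if SORT really builds the sorted copy as a separate step, which is an assumption.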
From: us on 6 Jan 2010 16:18

"Jan Simon" <matlab.THIS_YEAR(a)nMINUSsimon.de> wrote in message <hi2phr$5nj$1(a)fred.mathworks.com>...
> Dear Matt!
>
> > > [foo,foo,foo,goo,goo]=f();
> > I know you use this approach, as I do. I was asking why other folks prefer to create a dummy variable in the workspace.
> If I have a gigantic vector, which nearly fills my memory, and I want to sort it with:
> [dummy, index] = sort(Array);
> or
> [index, index] = sort(Array);
> in both cases the array is created (any different opinions?)!

yes, see my reply (including FOO test program) to bl...
the point is
- if INDEX|1,2 is very large within the function, it will fail...
- if INDEX|1,2 is large and adds to the callers WS, it will fail a bit later...

> That [index] and [index] have the same address is not really surprising. It is the same variable. You do not need a FORMAT DEBUG for that.

no, of course, but it is nice to convince some ML agnostics by sheer command window output...

> If I show "[index, index] = sort(Array)" to a Matlab beginner, I have to explain, that I assume, that the output arguments of a function are assigned from the left to the right.
> If I show "[dummy, index] = sort(Array)", I do not have to explain anything.

well... NO mercy on this one: i know you've (probably) tried to explain to a C-novice an inscrutable pointer-construct... they just have to learn...

SO - old CSSMers will stick with the
  [foo,foo,foo,goo,goo]=f();
syntax...

:-)
urs
From: Bruno Luong on 6 Jan 2010 16:25
But us, isn't the test unfair when the variable a_i is not properly cleared? I added the "clear" command and both syntaxes fail at the same place (see the new foo below).

%%%%%%%
function foo(varargin)
% 1) subroutine GOO returns 2 vars in diff memory locations
% 2) fill FOO WS with 2 vars
% 3) if memory overflows...
% 4) clear all WS vars
% 5) fill FOO WS with 1 var at same memory address
     nt=10^8; % <- #var to create...
if ~nargin
     n=1000000; %#ok (r2009b)
else
     n=varargin{1}; %#ok (r2009b)
end
% fill FOO WS with individual var A_xxx/B_xxx
try
for i=1:nt
     com=sprintf('[a_%d,b_%d]=goo(n);',i,i);
     eval(com);
     % Added by Bruno
     com=sprintf('clear a_%d',i);
     eval(com);
end
catch %#ok
     w=whos('b*'); % <- only count B_xxx
     disp(sprintf('ERROR at index %5d %5d',i,numel(w)));
     clear a* b*; % <- CLEAR VARS
% fill FOO WS with individual var B_xxx
try
for i=1:nt
     com=sprintf('[b_%d,b_%d]=goo(n);',i,i);
     eval(com);
end
catch %#ok
     w=whos('b*'); % <- only count B_xxx
     disp(sprintf('ERROR at index %5d %5d',i,numel(w)));
end
end
end
function [a,b]=goo(varargin) %#ok
     a=zeros(1,varargin{1},'double');
     b=a;
     b(1)=b(1)+1; % <- allocate new memory...
end

% Command line
>> foo(1000000)
ERROR at index 176 175
ERROR at index 176 175
>> foo(10000000)
ERROR at index 16 15
ERROR at index 16 15
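For a single call, the "dummy followed by clear" style compared in this test might look like the sketch below. The data and the name `stillThere` are made up for illustration. Whether peak memory differs from the [index,index] form depends on MATLAB internals that this thread does not settle.

```matlab
% Sketch: the "dummy" style with an explicit clear right after the call.
Array = rand(1, 1000);               % example data, not from the thread
[dummy, index] = sort(Array);        % dummy holds the sorted copy for a moment
clear dummy                          % release it before anything else is allocated
stillThere = exist('dummy', 'var');  % 0 once the variable has been cleared
disp(stillThere)
```

After the clear, only `index` remains in the workspace, which is the state the modified foo reaches in each loop iteration.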