From: Aaron Schurger on 5 Mar 2010 10:11

Dear Loren,

The burning question I have is: why not just allow the programmer to force call-by-reference if desired? I can't think of a good reason not to have this feature in the language (although you may know of the reason). I am constantly up against this problem, and constantly getting "out of memory" errors in spite of my thorough knowledge of how MATLAB tries to intelligently handle the allocation of memory in the local context of a function. Sometimes you just want to call by reference, and no amount of cleverness on the part of the programmer or MATLAB will help.

I have a cell array with many large matrices in it (signal data). I have a function that goes through the cell array one element at a time and applies a lowpass filter to the matrix. MATLAB is smart enough to make a local copy of each element of the cell array only when it is modified (in this case when the filter is applied). So the program starts to iterate through the elements of the cell array, and then at a certain point... "out of memory". In this case, I'd rather have the out-of-memory error as soon as I call the function, since eventually it is guaranteed to halt (a case where smart copying does not pay off). I could iterate through the cell array at top level, but that defeats the purpose, since the idea was to encapsulate the processing (which has a few other steps) in a function.

"My kingdom for a pointer!"

Is there any chance that a future version of MATLAB will offer call by reference? I would certainly like to have it!

Thanks,
Aaron Schurger

Loren Shure <loren(a)mathworks.com> wrote in message <MPG.219e667db96ef9889897e6(a)news.mathworks.com>...
> In article <muy4pfwlrvb.fsf(a)G99-Boettcher.llan.ll.mit.edu>,
> boettcher(a)ll.mit.edu says...
> > "Philipp " <philipp.maurer(a)ocag.ch> writes:
> >
> > > 1) I want a function that operates on and changes a data
> > > structure. For example, let's have a cell array <map>
> > > storing mapEntry structs with each struct having a .key
> > > and .value entry. Now I want to write a function that adds
> > > a key/value pair to the map cell array if it is not there
> > > otherwise add it. In MATLAB the function call would be like:
> > > map = {};
> > > map = addEntry( map, 'foo', 'test' )
> > > Now the question arises, what happens if map gets very
> > > large? As far as I understood MATLAB, there are now three
> > > copy operations involved. (two there and one in the
> > > functions to copy the map coming in to the map going out).
> >
> > Copying only happens when you modify a piece of data. So if addEntry
> > had only a line that returned map, there would be NO data copying. In
> > each case, a smart reference would happen instead.
> >
> > When you DO modify an array, that is when it is "unshared", and the
> > allocation and copy occurs. But even here, the copy is not necessarily
> > a "deep" copy. If you modify a container object (a cell or a struct),
> > only the top level list of pointers is copied. Each "contained" object
> > is now shared between two cell arrays. When you modify one of those
> > objects, its top level is copied. And so on.
> >
> > So in your case, adding a single cell element to map, there is exactly
> > one allocation and one memcpy. This memcpy is exactly
> > length(map)*sizeof(void*), that is, one pointer for every element in the
> > cell array.
> >
> > Then the MATLAB accelerator comes into play. It is able to optimize
> > some functions of the form you just listed (x = fun(x)), by performing
> > the operation in place. I don't know exactly what conditions the
> > function must meet for this to happen. But certainly it won't work if
> > you are expanding the size of your variable. You should definitely be
> > preallocating map to be big enough, and keep track of the number of
> > current entries with another variable.
> >
> > > 2) I have two lists, let's say a list of bananas and a list
> > > of juices. Every banana has certain attributes and every
> > > juice as well. More than that, every juice has a list of
> > > bananas it is made of. I see only one possibility: every
> > > banana is given a unique ID and every juice stores a list
> > > of these IDs. When I have a juice now and want to access
> > > the attributes of its bananas, I get the ID and then I have
> > > to search my banana list for this ID. See, this is very
> > > slow. I want the juice to refer to the banana (like C
> > > pointers or Java references).
> >
> > Instead of storing a list of unique banana identifiers and searching for
> > them, store a list of banana indices. This will end up
> > (algorithmically, anyway) about as complex as a pointer list.
> >
> > for i=1:length(juice(j).banana_list)
> >     banana_idx = juice(j).banana_list(i);
> >     current_banana = bananas(banana_idx);
> > end
> >
> > -Peter
>
> Also consider using nested functions for some of your purposes. There
> is a single instance of the data available to the nested function
> workspace and it can be updated by calling the nested function (many
> examples in the doc).
>
> Also, look at my blog (reference below) to see discussions of memory
> copying.
>
> --
> Loren
> http://blogs.mathworks.com/loren/

From: Matt J on 5 Mar 2010 10:36

"Aaron Schurger" <aaron.schurger_removeThisText(a)gmail.com> wrote in message <hmr6ut$2fg$1(a)fred.mathworks.com>...

"My kingdom for a pointer!"

Is there any chance that a future version of MatLab will offer call by reference? I would certainly like to have it!
=======

It already does, although it requires a modest amount of wrapping. Essentially, what you need to do is make yourself a generic handle class as follows:

    classdef myClass < handle
        properties
            data;
        end
    end

Now you can create an object to hold your cell array data, as in the following example:

    >> ob=myClass; ob.data={1,2,3};
    >> ob.data
    ans =
        [1]    [2]    [3]

Now you can make changes to ob.data purely by reference, as illustrated next:

    >> test(ob); ob.data
    ans =
        [2]    [4]    [6]

where I have defined test.m as follows:

    function test(ob)
        ob.data=cellfun(@(a)2*a, ob.data,'uni',0); %double all cell elements
    end

From: Matt J on 5 Mar 2010 10:51

"Matt J " <mattjacREMOVE(a)THISieee.spam> wrote in message <hmr8ds$7uh$1(a)fred.mathworks.com>...
> function test(ob)
>     ob.data=cellfun(@(a)2*a, ob.data,'uni',0); %double all cell elements
> end

Sorry, bad example. You would need to implement test.m this way, to avoid a complete new copy of ob.data inside its workspace:

    function test(ob)
        for ii=1:length(ob.data)
            ob.data{ii}=ob.data{ii}*2;
        end
    end

Anyway, the point is, the copy of ob.data being manipulated inside test.m is exactly the one in your base workspace. So, you wouldn't get an out-of-memory error in larger sized examples.
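
For the lowpass-filtering case raised at the top of the thread, the same handle-class idea might look like the sketch below. It is only an illustration under stated assumptions: the names DataHolder and lowpassAll are hypothetical, the 5-point moving average is a stand-in for whatever filter is actually used, and the three parts would have to be saved as separate files as marked, since a classdef must live in its own file. Whether MATLAB avoids every intermediate copy here depends on the version and its in-place optimizations; filter still allocates one output matrix per element before it is stored back.

```matlab
% --- File DataHolder.m (hypothetical name) ---
% Generic handle wrapper: variables of this type refer to one shared object.
classdef DataHolder < handle
    properties
        data   % cell array of signal matrices
    end
end

% --- File lowpassAll.m (hypothetical name) ---
function lowpassAll(holder, b, a)
% Filter each matrix in holder.data along its columns, one element at a time.
% The caller sees the result without a return value, because holder is a handle.
for ii = 1:numel(holder.data)
    holder.data{ii} = filter(b, a, holder.data{ii});
end
end

% --- At the command prompt or in a script ---
holder = DataHolder;
holder.data = {rand(1000, 8), rand(1000, 8)};   % stand-in signal data
b = ones(1, 5) / 5;   % assumed filter: 5-point moving average
a = 1;
lowpassAll(holder, b, a);   % holder.data now holds the filtered matrices
```

Processing one element per loop iteration keeps the extra memory at roughly one matrix at a time, which is the same reason the loop version of test.m above is preferred over the cellfun version.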