From: Rune Allnor on 27 Jun 2010 14:16 On 27 Jun, 20:06, robert bristow-johnson <r...(a)audioimagination.com> wrote: > On Jun 27, 8:09 am, Rune Allnor <all...(a)tele.ntnu.no> wrote: > > > On 27 Jun, 07:56, eric.jacob...(a)ieee.org (Eric Jacobsen) wrote: > > > > IMHO the most plausible explanation for why this has never been > > > addressed is that a conscious decision has been made within the > > > MathWorks that Matlab should not have flexible or zero-based indexing > > > capability. > > > The reason why it has never been addressed is that MATLAB is > > an acronym for MATrix LABoratory. In linear algebra the numerical > > arrays are - like it or not - indexed base 1. > > whether it's 0 or 1 is not the issue. whether the user can define it > for a particular array *is* the issue. In that case, why can't you just use the OO capabilities of matlab and declare your own array class, say, RBJarray, and overload the () operator? Ought to be a piece of cake, provided matlab's OO is half decent. Rune
From: Walter Roberson on 4 Jul 2010 12:42

robert bristow-johnson wrote:
> On Jun 27, 8:09 am, Rune Allnor <all...(a)tele.ntnu.no> wrote:
>> Changing this would undermine nearly 30 years worth of code base.
> no it wouldn't. it could be perfectly backward compatible, because
> newly-created arrays would have default origin of 1 for every
> dimension of the array. the user would have to call a not-yet-
> existing function to change the origin.

Robert, your enthusiasm for this idea has led you to neglect thinking the problems through.

Suppose I have a mex file written to the current API. One or more of the arguments to the function are array indices. Now introduce your proposed new indexing scheme and have the user create such an array and pass it in to the existing mex file. Your claim that the proposed change "could be perfectly backwards compatible" implies that the indices the user passes in must not rely on the new indexing scheme, because "perfectly backwards compatible" means that old code must work UNCHANGED.

Now *without changing the API for existing routines*, how is Matlab going to know if a (say) 5 being passed in as a numeric value is already adjusted to be 1-relative or needs to be silently re-biased by Matlab to the appropriate basis in order to preserve backwards compatibility?

What if the dimension to be indexed is itself a parameter so that the unbiasing that needs to take place is not constant? What if the dimension number is not an _obvious_ parameter, such as if you had some encoding such as the mesh encoding that mixes control values and data values in the same array? What if the indices that have to be rebiased have been packed, such as two 8-bit indices numerically jammed into a 16 bit number -- how is Matlab going to know the jamming algorithm to know how to rebias and construct the appropriate new index?

As long as indices are computable data then in order to support different index biases you MUST break backwards compatibility, in that the existing code would have to be enhanced to know about and take into account the new indexing scheme for any parameter that is not provably an old-style matrix.
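
The packed-index objection can be made concrete with a small C sketch. It is only an illustration under stated assumptions: the packing layout (row in the high byte, column in the low byte) and the function names `pack_indices`, `unpack_indices`, `rebias_packed` and `rebias_naive` are hypothetical and do not come from the thread or from any MATLAB API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packing scheme: row index in the high byte, column index in
 * the low byte. Nothing in the resulting integer says it holds indices. */
uint16_t pack_indices(uint8_t row, uint8_t col)
{
    return (uint16_t)(((unsigned)row << 8) | col);
}

void unpack_indices(uint16_t packed, uint8_t *row, uint8_t *col)
{
    assert(row != NULL && col != NULL);
    *row = (uint8_t)(packed >> 8);
    *col = (uint8_t)(packed & 0xFFu);
}

/* Rebias both indices from origin 1 to origin 0. Only code that knows the
 * packing layout above can do this correctly. */
uint16_t rebias_packed(uint16_t packed)
{
    uint8_t row, col;
    unpack_indices(packed, &row, &col);
    assert(row >= 1 && col >= 1);
    return pack_indices((uint8_t)(row - 1), (uint8_t)(col - 1));
}

/* What a runtime could do without that knowledge: treat the value as one
 * plain number and subtract 1, which changes only the low byte here. */
uint16_t rebias_naive(uint16_t packed)
{
    return (uint16_t)(packed - 1);
}
```

For the packed pair (3, 5) the two functions disagree, which is the point of the objection: the correct adjustment depends on an encoding that only the author of the old code knows.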
From: robert bristow-johnson on 4 Jul 2010 20:43 On Jul 4, 12:42 pm, Walter Roberson <rober...(a)hushmail.com> wrote: > robert bristow-johnson wrote: > > On Jun 27, 8:09 am, Rune Allnor <all...(a)tele.ntnu.no> wrote: > >> Changing this would undermine nearly 30 years worth of code base. > > > > no it wouldn't. it could be perfectly backward compatible, because > > newly-created arrays would have default origin of 1 for every > > dimension of the array. the user would have to call a not-yet- > > existing function to change the origin. > > Robert, your enthusiasm for this idea has led you to neglect thinking > the problems through. that might be a premature judgment. you don't know how much i have thought this through for more than a decade. if you can get Google Groups to search adequately, you might find the places where the objections (similar to yours) were brought up and i swatted them down. even Cleve eventually admitted that it was, from the strict definition of the term, backward compatible. > Suppose I have a mex file written to the current API. as long as no one applies your MEX function (written under the old assumptions) to any array with any origin not 1, there would be no problem. still backward compatible. no one's code breaks. if this extension or enhancement to MATLAB were adopted and you wanted your .mex function to work with arrays of origin different than 1, you would have to modify the .mex function to look for the origins (using the new API). otherwise your function would work as if all of the origins were 1 when they may not be. it's no different than with any other extension or enhancement to a language. the issue you brought up is not a violation of backward compatibility. old code (with old .mex files) would still work the same way they did before. r b-j
From: Steven Lord on 6 Jul 2010 15:38

"robert bristow-johnson" <rbj(a)audioimagination.com> wrote in message news:2f984955-7c2c-4414-8b5b-5843e314a8d7(a)b35g2000yqi.googlegroups.com...
> On Jul 4, 12:42 pm, Walter Roberson <rober...(a)hushmail.com> wrote:
> > robert bristow-johnson wrote:
> > > On Jun 27, 8:09 am, Rune Allnor <all...(a)tele.ntnu.no> wrote:
> > >> Changing this would undermine nearly 30 years worth of code base.
> > >
> > > no it wouldn't. it could be perfectly backward compatible, because
> > > newly-created arrays would have default origin of 1 for every
> > > dimension of the array. the user would have to call a not-yet-
> > > existing function to change the origin.
> >
> > Robert, your enthusiasm for this idea has led you to neglect thinking
> > the problems through.
>
> that might be a premature judgment. you don't know how much i have
> thought this through for more than a decade. if you can get Google
> Groups to search adequately, you might find the places where the
> objections (similar to yours) were brought up and i swatted them
> down. even Cleve eventually admitted that it was, from the strict
> definition of the term, backward compatible.
>
> > Suppose I have a mex file written to the current API.
>
> as long as no one applies your MEX function (written under the old
> assumptions) to any array with any origin not 1, there would be no
> problem. still backward compatible. no one's code breaks.

Except for the person who DOES apply an older MEX function to a non-1 based array. People often need/want to reuse their old code, and I've found people tend to get a wee bit upset if you tell them "Sorry, don't do that." unless there's a major benefit they can immediately see (and sometimes even then.) While some (including yourself) may see 0-based indexing as such a major benefit, there are also many who would not.

> if this extension or enhancement to MATLAB were adopted and you wanted
> your .mex function to work with arrays of origin different than 1, you
> would have to modify the .mex function to look for the origins (using
> the new API). otherwise your function would work as if all of the
> origins were 1 when they may not be.

And that is an incompatibility. Let's take a look at a few others.

1) Let's say that a user had a function that needs to loop over the columns of their matrix. This is not a contrived example; I've seen plenty of code that does something similar.

    for whichcolumn = 1:size(A, 2)
        % process column A(:, whichcolumn)
    end

If A is a 1-based array, then everything works as it did before. If A was a 0-based array, then this would error with an error message like "Index exceeds matrix dimension" or something similar. Code that used to work no longer works. That's an incompatibility.

Even worse, let's say the user's code, for whatever reason, only needed to process all but the final column of A.

    for whichcolumn = 1:size(A, 2)-1
        % process column A(:, whichcolumn)
    end

Now this code can SILENTLY GIVE THE WRONG ANSWER if A is a 0-based array, as it will process all but the _first_ column of A.

2) What happens if I SAVE a 0-based array in a MAT-file and LOAD it in a version prior to the introduction of this change? Would you expect that to work?

3) Suppose A is a 0-based array and B is a 1-based array. What happens if I ask for A+B? A.*B? [Assume all the sizes match; just consider the based-ness for this scenario.] I'm guessing you're going to say that A+B and A.*B both error; operating under that assumption, let's take a look at one more example. What should the command pi+A do? Does the pi built-in function return a 0-based array or a 1-based array?
Seems to me that users would expect pi+A to add pi to each element of A -- but for backwards compatibility, and to agree with your own proposal from earlier in this thread, PI would _have_ to return a 1-based array and so pi+A would ERROR. If you say that PI should be "smart enough" to know with what it's going to be combined and return the appropriate-indexed scalar: x = pi; y = A+x; z = B+x; _One_ of the latter two operations must error unless scalars are treated as a special case. If they are treated specially, x = 1:10; will encounter the same problem assuming A and B are both 10-element vectors, since x must have a specified base when it is created. > it's no different than with any other extension or enhancement to a > language. the issue you brought up is not a violation of backward > compatibility. old code (with old .mex files) would still work the > same way they did before. Robert, when you've brought this type of system up in the past I've seriously thought about how it could work, but based on the potential problems I called out above, speaking personally I do not think it would be a good idea to modify MATLAB indexing to use your proposed system. I think your best option would be to create your own 0-based (or variable-based) object and overload those functions with which you want to work as well as subscripted indexing and assignment -- that way you can control the behavior of indexing for your object. You can even make use of the built-in functions (rather than having to reimplement them) by using the BUILTIN function to call them on the 1-based data that's stored inside the array as a private data member, and adjust the indices afterward (if necessary.) 
Indeed, Nabeel posted an object to do just this to the File Exchange several years ago: http://www.mathworks.com/matlabcentral/fileexchange/1168-varbase If you wanted to write a classdef-based version of this object then the Object-Oriented Programming section of the documentation will contain the information you'll need to become familiar with this style of object. In particular, since indexing will be a major component of this object, the following page will be of special interest: http://www.mathworks.com/access/helpdesk/help/techdoc/matlab_oop/br09eqz.html -- Steve Lord slord(a)mathworks.com comp.soft-sys.matlab (CSSM) FAQ: http://matlabwiki.mathworks.com/MATLAB_FAQ To contact Technical Support use the Contact Us link on http://www.mathworks.com
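
The two loop scenarios from point 1) of the post above can be simulated in C. This is a sketch under assumptions, not MATLAB behaviour: `column_axis`, `storage_column` and `legacy_loop` are hypothetical names, and an out-of-bounds index is modelled as a return value of -1 where MATLAB would raise an error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the column dimension of an array. */
typedef struct {
    long size;    /* number of columns */
    long origin;  /* index of the first column: 1 today, 0 if rebased */
} column_axis;

/* Map a user-visible column index to a 0-based storage column,
 * or return -1 when the index is out of bounds. */
long storage_column(const column_axis *ax, long index)
{
    assert(ax != NULL && ax->size > 0);
    if (index < ax->origin || index >= ax->origin + ax->size)
        return -1;
    return index - ax->origin;
}

/* Legacy loop "for whichcolumn = 1:last": record the storage columns that
 * get processed. Returns the number of columns visited, or -1 as soon as
 * an index falls outside the array. */
long legacy_loop(const column_axis *ax, long last, long *visited)
{
    assert(visited != NULL);
    long count = 0;
    for (long whichcolumn = 1; whichcolumn <= last; ++whichcolumn) {
        long col = storage_column(ax, whichcolumn);
        if (col < 0)
            return -1;
        visited[count++] = col;
    }
    return count;
}
```

With four columns, `legacy_loop(&ax, 4, v)` fails for origin 0 (the loud case), while `legacy_loop(&ax, 3, v)` succeeds for both origins but visits storage columns 0..2 for origin 1 and 1..3 for origin 0 (the silent case).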
From: robert bristow-johnson on 6 Jul 2010 18:19
On Jul 6, 3:38 pm, "Steven Lord" <sl...(a)mathworks.com> wrote: > "robert bristow-johnson" <r...(a)audioimagination.com> wrote in message > news:2f984955-7c2c-4414-8b5b-5843e314a8d7(a)b35g2000yqi.googlegroups.com... > > > > > On Jul 4, 12:42 pm, Walter Roberson <rober...(a)hushmail.com> wrote: > > > robert bristow-johnson wrote: > > > > On Jun 27, 8:09 am, Rune Allnor <all...(a)tele.ntnu.no> wrote: > > > >> Changing this would undermine nearly 30 years worth of code base. > > > > > no it wouldn't. it could be perfectly backward compatible, because > > > > newly-created arrays would have default origin of 1 for every > > > > dimension of the array. the user would have to call a not-yet- > > > > existing function to change the origin. > > > > Robert, your enthusiasm for this idea has led you to neglect thinking > > > the problems through. > > > that might be a premature judgment. you don't know how much i have > > thought this through for more than a decade. if you can get Google > > Groups to search adequately, you might find the places where the > > objections (similar to yours) were brought up and i swatted them > > down. even Cleve eventually admitted that it was, from the strict > > definition of the term, backward compatible. > > > > Suppose I have a mex file written to the current API. > > > as long as no one applies your MEX function (written under the old > > assumptions) to any array with any origin not 1, there would be no > > problem. still backward compatible. no one's code breaks. > > Except for the person who DOES apply an older MEX function to a non-1 based > array. the definition of "backward compatible" that i am using is (from Wikipedia): "a product or a technology is said to be backward compatible when it is able to fully take the place of an older product... Backward compatibility is a relationship between two components, rather than being an attribute of just one of them. 
More generally, a new component is said to be backward compatible if it provides all of the functionality of the old component." what i am and had been proposing for a decade is backward compatible in that meaning. if people try to use the "new feature" (non-1 based arrays) and misuse them and get error messages, that does not mean that it fails backward compatibility. if a MEX function was unaware of this (and old ones *would* be unaware), it would treat any array argument as if it had origin 1 for every dimension. it would be wrong, but that would be a misuse (to use a function never written to deal with other origins on a non-1 based array). and nothing would blow up. > Robert, when you've brought this type of system up in the past I've > seriously thought about how it could work, but based on the potential > problems I called out above, speaking personally I do not think it would be > a good idea to modify MATLAB indexing to use your proposed system. > > I think your best option would be to create your own 0-based (or > variable-based) object and overload those functions with which you want to > work as well as subscripted indexing and assignment -- that way you can > control the behavior of indexing for your object. here's the deal: i am a DSP person, not an OOP person. MATLAB has marketed itself (falsely) making claims (in the v4 and v5 user manuals) as: "MATLAB integrates numerical analysis, matrix computation, signal processing, and graphics in an easy-to-use environment where problems and solutions are expressed just as they are written mathematically - ... " "MATLAB is a high-performance language for technical computing. It integrates computation, visualization, and programming in an easy-to- use environment where problems and solutions are expressed in familiar mathematical notation." note the phrases in claims: "familiar mathematical notation" and "just as they are written mathematically". 
i submit that those claims are false in a sense that is salient particularly for those of us who use MATLAB for (digital) signal processing. and i suspect the claims are false for some other users that also deal with data that are naturally sequenced with subscripts or indices that are non-positive.

now, what i want is for MATLAB to live up to that. MATLAB is not C++ where, if i complain that C doesn't give me a complex variable type, the first response from someone is that i should use C++ and a library with a complex class. MATLAB *does* have an OOP capability, and those two subscript calls (i think they're called subsref() and subsasgn()) make it possible to insert a shim that intercepts the indices. so, what you say, Steven, is true, a class can be done, but what i need to use MATLAB (or Octave) for is to do DSP. i should not have to become an OOPs programmer just to do basic DSP with equations that are recognizable in the DSP lit.

i remember that Thomas Krauss once told me (a decade ago) that he brought this up early before the Sig Proc Toolbox (the very first MATLAB toolbox) was completed and released. this is really when you guys should have fixed the problem. you have forced people to adopt non-standard non-natural indexing for the problems they work on.

besides the FFT (where it should be obvious) i think TMW blew it with the definitions of polyval() and related functions like polyfit(). they should have used 0-based arrays of coefficients with a(n) being the coefficient for x^n. similarly TMW also screwed up the definitions of conv() and related functions. when polynomials are multiplied to each other, we know it's like their coefficient sequences are convolved and the FFT can be used to convolve large coefficient sequences. but the order of coefficients (and the index values associated) are just plain wrong in MATLAB. besides missing the proper 0-based counting, TMW put the order wrong.
> You can even make use of > the built-in functions (rather than having to reimplement them) by using the > BUILTIN function to call them on the 1-based data that's stored inside the > array as a private data member, and adjust the indices afterward (if > necessary.) if we were to do it that way, what the object converter (i would call it "rebase" or "reorigin") should do, is check to see if the origins for all dimensions in the output array are all 1, then it should return a regular MATLAB matrix variable and not one of this "rebase" type) so that all of the existing MATLAB functions can work on this 1- based array. > Indeed, Nabeel posted an object to do just this to the File > Exchange several years ago: i remember. and i remember i was unable to use it (because much more needed to be done). > http://www.mathworks.com/matlabcentral/fileexchange/1168-varbase > > If you wanted to write a classdef-based version of this object then the > Object-Oriented Programming section of the documentation will contain the > information you'll need to become familiar with this style of object. In > particular, since indexing will be a major component of this object, the > following page will be of special interest: > > http://www.mathworks.com/access/helpdesk/help/techdoc/matlab_oop/br09... again, you're requiring me to be an OOPs programmer when what i need MATLAB for is to do signal processing. this is something *you*guys* should fix. i would be happy to help in specification (and i had and i'll dig up the old text) but this is a deficit in MATLAB and it's sorta non-responsible for TMW to require users to fix such deficits in their product. that varbase class should be renamed "reorigin" and, to begin with, the following operators should be overloaded according to the following spec, and if a resulting "reorigin" object happens to have only 1 for the origins of every dimension, then a regular 1-based MATLAB variable should be returned. 
i know it's a little bit to read, but i would ask that you read it. to make that class work, this is the minimum that is needed to make it work usefully and correctly (and backward compatible). i am, of course, guessing at what the C-like data structure is for a MATLAB variable, but whatever it is, the pointer to the origin[] vector should be appended to the end or use an existing *unused* field. then it would be backward compatible with MEX files.

r b-j

________________________________________________

    enum MATLAB_class {text, real, complex};
    // I don't wanna cloud the issue considering other classes.

    typedef struct {
        void* data;               // pointer to actual array data
        char* name;               // pointer to the variable's name
        enum MATLAB_class type;   // class of MATLAB variable (real, complex,...)
        int num_dimensions;       // number of array dimensions >= 2
        long* size;               // points to a vector with the number of rows, columns, etc.
    } MATLAB_variable;

    char name[32];                // suppose MATLAB names are unique to 31 chars
    long size[num_dimensions];

    if (type == text) {
        char data[size[0]*size[1]*...*size[num_dimensions-1]];
    } else if (type == real) {
        double data[size[0]*size[1]*...*size[num_dimensions-1]];
    } else if (type == complex) {
        double data[size[0]*size[1]*...*size[num_dimensions-1]][2];
    }

The above is sorta C-like pseudocode. I'm writing it as if the declarations can be allocated like malloc() does.

Currently, when an element, A(n,k), of a 2 dimensional MATLAB array A is accessed, first n and k are confirmed to be integer value (not a problem in C), then confirmed to be at least 1 and less than or equal to size[0] and size[1], respectively.
If those constraints are satisfied, the value of that element is accessed as:

    data[(n-1)*size[0] + (k-1)];

For a 3 dimensional array, A(m,n,k), it would be the same but now:

    data[((m-1)*size[1] + (n-1))*size[0] + (k-1)];

What is proposed is to first add a new member to the MATLAB variable structure called "origin" which is a vector of the very same length (num_dimensions) as the "size" vector. The default value for all elements of the origin[] vector would be 1 with only the exceptions outlined below. This is what makes this backwards compatible, in the strictest sense of the term.

    typedef struct {
        void* data;
        char* name;
        enum MATLAB_class type;
        int num_dimensions;
        long* size;
        long* origin;             // points to a vector with index origin for each dimension
    } MATLAB_variable;

    char name[32];
    long size[num_dimensions];
    long origin[num_dimensions];

Now before each index is used, it is checked against the bounds for that dimension:

    origin[dim] <= index < size[dim]+origin[dim]    where 0 <= dim < num_dimensions

Since the default for origin[dim] is 1, this will have no effect, save for the teeny amount of processing time needed to look up the origin, on existing MATLAB legacy code.

So to access a single element of an array A, this C array index would look like:

    data[(n-origin[1])*size[0] + (k-origin[0])];

For a 3 dimensional array, A(m,n,k), it would look like:

    data[((m-origin[2])*size[1] + (n-origin[1]))*size[0] + (k-origin[0])];

Okay, how someone like myself would use this to do something different is that there would be at least two new MATLAB facilities similar to size() and reshape() that I might call "origin()" and "reorigin()", respectively. Just like MATLAB size() function returns the contents of the size[] vector, origin() would return, in MATLAB format, the contents of the origin[] vector. And just like reshape() changes (under proper conditions) the contents of the size[] vector, reorigin() would change the contents of the origin[] vector.
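
The bounds check and the offset formulas above generalize to any number of dimensions. The following C sketch is one way to write that; it assumes the numbering used by the formulas in the post, where dimension 0 belongs to the last, fastest-varying subscript (so A(n,k) is passed as index[0] = k, index[1] = n). The type `array_shape` and the function `element_offset` are hypothetical, and returning -1 for an out-of-bounds subscript is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced, hypothetical version of the proposed structure: only the
 * fields needed to turn subscripts into an offset. */
typedef struct {
    int         num_dimensions;
    const long *size;    /* extent of each dimension */
    const long *origin;  /* first valid index of each dimension */
} array_shape;

/* index[d] is the subscript for dimension d, with dimension 0 varying
 * fastest. Returns the offset into data[], or -1 if any subscript fails
 * origin[d] <= index[d] < size[d] + origin[d]. */
long element_offset(const array_shape *s, const long *index)
{
    assert(s != NULL && index != NULL && s->num_dimensions > 0);
    long offset = 0;
    for (int d = s->num_dimensions - 1; d >= 0; --d) {
        long rel = index[d] - s->origin[d];   /* 0-based position */
        if (rel < 0 || rel >= s->size[d])
            return -1;
        offset = offset * s->size[d] + rel;   /* Horner-style accumulation */
    }
    return offset;
}
```

For two dimensions the loop evaluates (index[1]-origin[1])*size[0] + (index[0]-origin[0]), and with every origin equal to 1 it reduces to the 1-based formula, which is the backward-compatibility argument in miniature.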
Since reorigin() does not exist in legacy MATLAB code (oh, I suppose someone could have created a function named that, but that's a naming problem that need not be considered here), there is no way for existing MATLAB programs to change the origins from their default values of 1, making this fix perfectly backward compatible.

Now, just as there are dimension compatibility rules that exist now for MATLAB operations, there would be a few natural rules that would be added so that "reorigined" MATLAB arrays could have operations applied to them in a sensible way.

ARRAY ADDITION ("+") and SUBTRACTION ("-") and element-by-element ARRAY MULTIPLICATION (".*"), DIVISION ("./"), POWER (".^"), and ELEMENTARY FUNCTIONS:

Currently MATLAB insists that the number of dimensions is equal and the size of each dimension is equal (that is, the same "shape") before adding or subtracting matrices or arrays. The one exception to that is adding a scalar to an array, in which a hypothetical array of equal size and shape with all elements equal to the scalar, is added to the array. The resulting array has the same size and shape as the argument arrays. The proposed system would, of course, continue this constraint and add a new constraint in that index origins for each dimension (the origin[] vector) would have to be equal for two arrays to be added or subtracted or element-by-element multiplied, etc. The resulting array would have the same shape and origin[] vector as the input arrays.

"MATRIX" MULTIPLICATION ("*"):

    A = B*C;

Currently MATLAB appropriately insists that the number of columns of B is equal to the number of rows of C (we shall call that number K). The resulting array has the number of rows of B and the number of columns of C.
The value of a particular element of A would be:

              K
    A(m,n) = SUM{ B(m,k) * C(k,n) }
             k=1

The proposed system would, of course, continue this constraint and add a new constraint in that index origins must be equal for each dimension where the lengths must be equal. That is, the number of columns of B is equal to the number of rows of C and the origin index of the columns of B is equal to the origin index of the rows of C. The resulting array has the number of rows of B and the number of columns of C and the origin index of the rows of B and the origin index of the columns of C. The value of a particular element of A would be:

           org-1+K
    A(m,n) = SUM{ B(m,k) * C(k,n) }
            k=org

where org = origin[0] for the B array and origin[1] for the C array which must be the same number. In the same manner as the present MATLAB, K would be size[0] for the B array and size[1] for the C array and would have to be the same number, otherwise a diagnostic error would be returned.

Both of these definitions are degenerations of the more general case where:

            +inf
    A(m,n) = SUM{ B(m,k) * C(k,n) }
           k=-inf

where here you consider B and C to be zero-extended to infinity in all four directions (up, down, left, and right). It's just that the zero element pairs do not have to be multiplied and summed. Matrix powers and exponentials (on square matrices) can be defined to be consistent with this extension of the matrix multiply.

MATRIX DIVISION ("/" and "\"):

Like the current matrix division, it would invert the operation of

    A = B*C

that is

    C = B\A    and    B = A/C

The same size (or shape) requirements and index origin requirements of the "*" operator would apply to "\" and "/".
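
The origin-aware product rule for A = B*C described above might be sketched in C as follows. The `matrix` type, the column-major storage, the explicit row/column field names and the functions `mat_at` and `matmul` are all assumptions made for this illustration; they are not part of the proposal's data structure, which numbers dimensions differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 2-D array with an origin per dimension. */
typedef struct {
    long rows, cols;
    long row_origin, col_origin;
    double *data;                 /* column-major, rows*cols elements */
} matrix;

/* Pointer to element (m, n), expressed in the matrix's own origins. */
double *mat_at(const matrix *a, long m, long n)
{
    long r = m - a->row_origin;
    long c = n - a->col_origin;
    assert(r >= 0 && r < a->rows && c >= 0 && c < a->cols);
    return &a->data[c * a->rows + r];
}

/* A = B*C. Returns 0 on success, -1 if the inner sizes or inner origins
 * disagree. a->data must already hold b->rows * c->cols elements and must
 * not alias the inputs. */
int matmul(matrix *a, const matrix *b, const matrix *c)
{
    if (b->cols != c->rows || b->col_origin != c->row_origin)
        return -1;
    a->rows = b->rows;  a->row_origin = b->row_origin;
    a->cols = c->cols;  a->col_origin = c->col_origin;
    long org = b->col_origin;     /* shared origin of the summed index */
    long K = b->cols;             /* shared length of the summed index */
    for (long m = a->row_origin; m < a->row_origin + a->rows; ++m) {
        for (long n = a->col_origin; n < a->col_origin + a->cols; ++n) {
            double sum = 0.0;
            for (long k = org; k <= org - 1 + K; ++k)
                sum += (*mat_at(b, m, k)) * (*mat_at(c, k, n));
            *mat_at(a, m, n) = sum;
        }
    }
    return 0;
}
```

The result inherits the row origin of B and the column origin of C, and the inner loop runs from org to org-1+K exactly as in the second summation above.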
Given A and B, whatever shape and origin requirement for B and C and whatever shape and origin returned in A in the statement:

    A = B*C;

would be the same requirements for B and A and define the resulting shape and origin for C in the statement:

    C = B\A

CONCATENATION:

This would also be a simple and straight-forward extension of how MATLAB presently concatenates arrays. When we say:

    A = [B C];

The number of rows of B and C must be equal, but the number of columns of B and C can be anything. The first columns of A are identical with the columns of B and then also must the indices of those columns. And independent of what the column indices of C are, they just pick up where the column index of B left off. This rule extension defaults to what MATLAB presently does if B and C are both arrays with origin 1. A similar rule extension can be made for

    A = [B ; C];

In all cases the upper left corner of A is identical to the upper left corner of B, both in value but also in subscripts (so A(1,1) becomes B(1,1) just like it does now in MATLAB).

FUNCTIONS THAT RETURN INDICES (min(), max(), find(), sort(), ind2sub(), and any others that I don't know about):

It must be internally consistent (and certainly can be made to be). The indices returned would be exactly like the 1-based indices returned presently in MATLAB except that the origin for the corresponding dimension (that defaults to 1) would be added to each C-like index. That is, just like now in MATLAB:

    [max_value, max_index] = max(A);

This must mean that A(max_index) is equal to max_value. I think that this is easy enough to define. The only hard part is to identify all MATLAB functions that search through an array and modify them to start and end at indices that might be different than 1 and size[dim] as are the search bounds today. It would instead search from origin[dim] to size[dim]+origin[dim]-1 which would default to the current operation if the origin equals 1.
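
The consistency requirement for index-returning functions, that A(max_index) equals max_value whatever the origin, can be illustrated for a vector with a short C sketch. The function names `max_with_origin` and `get_with_origin` are hypothetical and only stand in for an origin-aware max() and subscript lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Find the maximum of a vector whose first element has index `origin`.
 * Stores the maximum in *max_value and returns its user-visible index. */
long max_with_origin(const double *data, long size, long origin,
                     double *max_value)
{
    assert(data != NULL && max_value != NULL && size > 0);
    long best = 0;                    /* C-like (0-based) offset */
    for (long i = 1; i < size; ++i) {
        if (data[i] > data[best])
            best = i;
    }
    *max_value = data[best];
    return best + origin;             /* add the origin to the C-like index */
}

/* Read element `index` of the same vector, honouring its origin. */
double get_with_origin(const double *data, long size, long origin, long index)
{
    assert(data != NULL);
    assert(index >= origin && index < origin + size);
    return data[index - origin];
}
```

With origin 1 the returned index matches what MATLAB returns today; with any other origin, feeding the returned index back into `get_with_origin` still yields the maximum.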
FOR ALL OTHER MATLAB OPERATIONS, until a reasonable extended definition for arrays with origins not 1 is thought up, MATLAB could either bomb out with an illegal operation error if the base is not 1 or could, perhaps, ignore the origin. Either way, it's still backwards compatible.