From: robert bristow-johnson on 7 Jul 2010 13:20

because i copied some of this stuff from a decade-old post, and evidently Google Groups has changed the number of characters per line and we cannot control word wrapping in our posts in Google Groups, i have edited the line endings (and we'll let the lines wrap naturally). this should be much more readable than my previous post.

Steven Lord and others at TMW, can you *please* read this carefully, with enough of an open mind, and *then* respond with questions, legit criticisms, suggestions and fixes. and please be straight with us about the definition of "backward compatible". if *old* code works with this (without making use of any calls to "reorigin()" or "rebase()" or whatever it would be named), it *is* "backward compatible", no?

________________________________________________________________

(here are the quotes from old MATLAB manuals that *should* be the uncompromised vision of the use of the product. just because TMW no longer prints these exact statements in their current manuals does not remove that vision for what MATLAB should be.)

"MATLAB integrates numerical analysis, matrix computation, signal processing, and graphics in an easy-to-use environment where problems and solutions are expressed just as they are written mathematically..."

"MATLAB is a high-performance language for technical computing. It integrates computation, visualization, and programming in an easy-to-use environment where problems and solutions are expressed in familiar mathematical notation."

note the phrases in these claims: "familiar mathematical notation" and "just as they are written mathematically". i submit that those claims are false in a sense that is salient particularly for those of us who use MATLAB for (digital) signal processing. and i suspect the claims are false for some other users who also deal with data that are naturally sequenced with subscripts or indices that are non-positive.
________________________________________________________________

(this is the foundation of the solution to this indexing problem that i am asking TMW people to read carefully and not reject due to red herrings.)

    enum MATLAB_class {text, real, complex};   // I don't wanna cloud the issue considering other classes.

    typedef struct
    {
        void* data;               // pointer to actual array data
        char* name;               // pointer to the variable's name
        enum MATLAB_class type;   // class of MATLAB variable (real, etc.)
        int num_dimensions;       // number of array dimensions >= 2
        long* size;               // vector with the number of rows, columns, etc.
    } MATLAB_variable;

    char name[32];                // suppose MATLAB names are unique to 31 chars
    long size[num_dimensions];

    if (type == text)
    {
        char data[size[0]*size[1]*...*size[num_dimensions-1]];
    }
    else if (type == real)
    {
        double data[size[0]*size[1]*...*size[num_dimensions-1]];
    }
    else if (type == complex)
    {
        double data[size[0]*size[1]*...*size[num_dimensions-1]][2];
    }

The above is sorta C-like pseudocode. I'm writing it as if the declarations could allocate storage the way malloc() does.

Currently, when an element, A(n,k), of a 2-dimensional MATLAB array A is accessed, first n and k are confirmed to be integer-valued (not a problem in C), then confirmed to be at least 1 and no greater than size[1] and size[0], respectively (note that in this pseudocode dimension 0 belongs to the last subscript, k). If those constraints are satisfied, the value of that element is accessed as:

    data[(n-1)*size[0] + (k-1)];

For a 3-dimensional array, A(m,n,k), it would be the same but now:

    data[((m-1)*size[1] + (n-1))*size[0] + (k-1)];

What is proposed is to first add a new member to the MATLAB variable structure called "origin", which is a vector of the very same length (num_dimensions) as the "size" vector. The default value for all elements of the origin[] vector would be 1, with only the exceptions outlined below. This is what makes this backward compatible, in the strictest sense of the term.

    typedef struct
    {
        void* data;
        char* name;
        enum MATLAB_class type;
        int num_dimensions;
        long* size;
        long* origin;   // points to a vector with origin for each dimension
    } MATLAB_variable;

    char name[32];
    long size[num_dimensions];
    long origin[num_dimensions];

Now before each index is used, it is checked against the bounds for that dimension (origin[dim] <= index < size[dim]+origin[dim], where 0 <= dim < num_dimensions). Since the default for origin[dim] is 1, this will have no effect on existing MATLAB legacy code, save for the teeny amount of processing time needed to look up the origin.

So to access a single element of an array, A(n,k), the C array index would look like:

    data[(n-origin[1])*size[0] + (k-origin[0])];

For a 3-dimensional array, A(m,n,k), it would look like:

    data[((m-origin[2])*size[1] + (n-origin[1]))*size[0] + (k-origin[0])];

Okay, the way someone like myself would use this to do something different is through at least two new MATLAB facilities, similar to size() and reshape(), that I might call "origin()" and "reorigin()", respectively. Just like the MATLAB size() function returns the contents of the size[] vector, origin() would return, in MATLAB format, the contents of the origin[] vector. And just like reshape() changes (under proper conditions) the contents of the size[] vector, reorigin() would change the contents of the origin[] vector.

Since reorigin() does not exist in legacy MATLAB code (oh, I suppose someone could have created a function named that, but that's a naming problem that need not be considered here), there is no way for existing MATLAB programs to change the origins from their default values of 1, making this fix perfectly backward compatible.

Now, just as there are dimension compatibility rules that exist now for MATLAB operations, there would be a few natural rules that would be added so that "reorigined" MATLAB arrays could have operations applied to them in a sensible way.
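
The bounds test and the origin-aware offset described above can be sketched for any number of dimensions as follows. This is only an illustration under the post's own convention (dimension 0 is the last, fastest-varying subscript); the function name `origin_offset` and its calling convention are hypothetical and are not part of MATLAB.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a MATLAB API): maps subscripts to a linear offset.
 * index[0] is the last subscript written (k in A(n,k)), index[1] the one
 * before it, and so on, matching the dimension numbering used in the post.
 * Returns 0 and stores the offset on success, -1 if a subscript is out of bounds. */
int origin_offset(int num_dimensions, const long *size, const long *origin,
                  const long *index, long *offset)
{
    assert(num_dimensions >= 1);
    assert(size != NULL && origin != NULL && index != NULL && offset != NULL);

    long result = 0;
    for (int dim = num_dimensions - 1; dim >= 0; --dim) {
        /* bounds test from the post: origin[dim] <= index < size[dim] + origin[dim] */
        if (index[dim] < origin[dim] || index[dim] >= size[dim] + origin[dim])
            return -1;
        /* Horner-style accumulation; for two dimensions this expands to
         * (n - origin[1]) * size[0] + (k - origin[0]) */
        result = result * size[dim] + (index[dim] - origin[dim]);
    }
    *offset = result;
    return 0;
}
```

With every origin left at the default of 1, the function reduces to the 1-based formulas quoted earlier, which is the backward-compatibility argument in miniature.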

ARRAY ADDITION ("+") and SUBTRACTION ("-") and element-by-element ARRAY MULTIPLICATION (".*"), DIVISION ("./"), POWER (".^"), and ELEMENTARY FUNCTIONS:

Currently MATLAB insists that the number of dimensions be equal and the size of each dimension be equal (that is, the same "shape") before adding or subtracting matrices or arrays. The one exception to that is adding a scalar to an array, in which case a hypothetical array of equal size and shape, with all elements equal to the scalar, is added to the array. The resulting array has the same size and shape as the argument arrays.

The proposed system would, of course, continue this constraint and add a new constraint: the index origins for each dimension (the origin[] vector) would have to be equal for two arrays to be added or subtracted or element-by-element multiplied, etc. The resulting array would have the same size[] vector and origin[] vector as the input arrays.

"MATRIX" MULTIPLICATION ("*"):

    A = B*C;

Currently MATLAB appropriately insists that the number of columns of B be equal to the number of rows of C (we shall call that number K). The resulting array has the number of rows of B and the number of columns of C. The value of a particular element of A would be:

                 K
    A(m,n) =    SUM{ B(m,k) * C(k,n) }
                k=1

The proposed system would, of course, continue this constraint and add a new constraint: the index origins must be equal for each dimension where the lengths must be equal. That is, the number of columns of B equals the number of rows of C, and the origin index of the columns of B equals the origin index of the rows of C. The resulting array, A, has the number of rows of B and the number of columns of C, and it also takes the origin index of the rows of B and the origin index of the columns of C. The value of a particular element of A would be:

              org-1+K
    A(m,n) =    SUM{ B(m,k) * C(k,n) }
               k=org

where org = origin[0] for the B array and origin[1] for the C array, which must be the same number.
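
A minimal sketch of the origin-aware product just defined, assuming a small 2-D container invented for the illustration. The type `OMatrix` and the functions `elem` and `omatrix_multiply` are hypothetical names, and the storage is row by row only to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical 2-D array with a per-dimension origin (not a MATLAB type). */
typedef struct {
    long rows, cols;              /* extent of each dimension */
    long row_origin, col_origin;  /* first valid subscript in each dimension */
    double *data;                 /* rows*cols values, stored row by row */
} OMatrix;

/* Pointer to element (m,n), using the array's own origins. */
static double *elem(const OMatrix *A, long m, long n)
{
    assert(m >= A->row_origin && m < A->row_origin + A->rows);
    assert(n >= A->col_origin && n < A->col_origin + A->cols);
    return &A->data[(m - A->row_origin) * A->cols + (n - A->col_origin)];
}

/* A = B*C. Returns 0 on success, -1 on a size or origin mismatch
 * (or if memory cannot be allocated). The caller frees A->data. */
int omatrix_multiply(const OMatrix *B, const OMatrix *C, OMatrix *A)
{
    /* existing rule: inner lengths agree; proposed rule: inner origins agree */
    if (B->cols != C->rows || B->col_origin != C->row_origin)
        return -1;

    A->rows = B->rows;
    A->cols = C->cols;
    A->row_origin = B->row_origin;   /* rows of A are indexed like rows of B */
    A->col_origin = C->col_origin;   /* columns of A like columns of C */
    A->data = malloc((size_t)(A->rows * A->cols) * sizeof *A->data);
    if (A->data == NULL)
        return -1;

    long org = B->col_origin;
    long K = B->cols;
    for (long m = A->row_origin; m < A->row_origin + A->rows; ++m) {
        for (long n = A->col_origin; n < A->col_origin + A->cols; ++n) {
            double sum = 0.0;
            for (long k = org; k <= org - 1 + K; ++k)   /* k = org .. org-1+K */
                sum += *elem(B, m, k) * *elem(C, k, n);
            *elem(A, m, n) = sum;
        }
    }
    return 0;
}
```

When all four origins are 1 the loops run over exactly the ranges MATLAB uses today, so the numerical result is unchanged; only the subscripts attached to the result differ when the origins differ.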

In the same manner as in present MATLAB, K would be size[0] for the B array and size[1] for the C array, and these would have to be the same number; otherwise a diagnostic error would be returned.

Both of these definitions are degenerate cases of the more general one:

               +inf
    A(m,n) =    SUM{ B(m,k) * C(k,n) }
              k=-inf

where you consider B and C to be zero-extended to infinity in all four directions (up, down, left, and right). It's just that the zero element pairs do not have to be multiplied and summed.

Matrix powers and exponentials (on square matrices) can be defined to be consistent with this extension of the matrix multiply.

MATRIX DIVISION ("/" and "\"):

Like the current matrix division, it would invert the operation of A = B*C, that is:

    C = B\A    and    B = A/C

The same size (or shape) requirements and index origin requirements of the "*" operator would apply to "\" and "/". Given A and B, whatever shape and origin are required of B and C, and whatever shape and origin are returned in A, in the statement:

    A = B*C;

would be the same requirements for B and A, and would define the resulting shape and origin for C, in the statement:

    C = B\A

CONCATENATION:

This would also be a simple and straightforward extension of how MATLAB presently concatenates arrays. When we say:

    A = [B C];

the number of rows of B and C must be equal, but the number of columns of B and C can be anything. The first columns of A are identical to the columns of B, and so must be the indices of those columns. And independent of what the column indices of C are, they just pick up where the column index of B left off. This rule extension defaults to what MATLAB presently does if B and C are both arrays with origin 1. A similar rule extension can be made for:

    A = [B ; C];

In all cases the upper left corner of A is identical to the upper left corner of B, both in value and in subscripts (so A(1,1) becomes B(1,1), just like it does now in MATLAB).
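
The bookkeeping for A = [B C] under this rule can be sketched as below. It computes only the shape and origin of the result and the column subscript at which C's first column lands. The names `Shape2D` and `hcat_shape` are hypothetical. The post does not say whether the row origins of B and C must agree, so this sketch assumes C's origins are simply ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical shape descriptor for a 2-D array with origins (not a MATLAB type). */
typedef struct {
    long rows, cols;
    long row_origin, col_origin;
} Shape2D;

/* Shape of A = [B C]. Returns 0 on success, -1 if the row counts differ.
 * *c_first_col receives the column subscript in A of C's first column.
 * Assumption: C's own origins are ignored, rows included. */
int hcat_shape(const Shape2D *B, const Shape2D *C, Shape2D *A, long *c_first_col)
{
    assert(B != NULL && C != NULL && A != NULL && c_first_col != NULL);

    if (B->rows != C->rows)
        return -1;

    A->rows = B->rows;
    A->cols = B->cols + C->cols;
    A->row_origin = B->row_origin;   /* upper left corner of A keeps B's subscripts */
    A->col_origin = B->col_origin;
    *c_first_col = B->col_origin + B->cols;   /* picks up where B left off */
    return 0;
}
```

For example, with B having 3 columns and a column origin of 1, C's first column lands at column 4 of A, which is what MATLAB does today.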

FUNCTIONS THAT RETURN INDICES (min(), max(), find(), sort(), ind2sub(), and any others that I don't know about):

It must be internally consistent (and certainly can be made to be). The indices returned would be exactly like the 1-based indices returned presently in MATLAB, except that the origin for the corresponding dimension (which defaults to 1) would be added to each C-like (0-based) index. That is, just like now in MATLAB:

    [max_value, max_index] = max(A);

This must mean that A(max_index) is equal to max_value.

I think that this is easy enough to define. The only hard part is to identify all MATLAB functions that search through an array and modify them to start and end at indices that might be different from 1 and size[dim], which are the search bounds today. They would instead search from origin[dim] to size[dim]+origin[dim]-1, which would default to the current operation if the origin equals 1.

FOR ALL OTHER MATLAB OPERATIONS, until a reasonable extended definition for arrays with origins other than 1 is thought up, MATLAB could either bomb out with an illegal operation error if the origin is not 1 or could, perhaps, ignore the origin. Either way, it's still backward compatible.
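
The index rule for search functions can be sketched for a one-dimensional max() as follows. The name `omax_index` is hypothetical, and the sketch assumes the first maximum is reported when there are ties. The returned subscript is the 0-based position plus the origin, so that A(max_index) equals max_value for any origin.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical origin-aware max for a vector (not a MATLAB API).
 * Returns the subscript of the first maximum, expressed relative to origin;
 * if max_value is not NULL, the maximum itself is stored there. */
long omax_index(const double *data, long size, long origin, double *max_value)
{
    assert(data != NULL && size > 0);

    long best = 0;                       /* C-like (0-based) position */
    for (long i = 1; i < size; ++i) {
        if (data[i] > data[best])
            best = i;
    }
    if (max_value != NULL)
        *max_value = data[best];
    return origin + best;                /* origin 1 gives today's MATLAB index */
}
```

With origin 1 the function returns the same index MATLAB's max() returns today; with origin -1, a maximum in the second stored element is reported at subscript 0.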
From: James Tursa on 8 Jul 2010 14:45
robert bristow-johnson <rbj(a)audioimagination.com> wrote in message <f3e6e4ef-8e06-4d8b-a4e6-392ed00594a4(a)d37g2000yqm.googlegroups.com>...
> (snip)
>
> Steven Lord and others at TMW, can you *please* read this carefully ...
> (snip)

To TMW: If you guys ever *do* take the time to read through this and seriously consider doing something like this, I hope you put your ideas out ahead of time for some type of peer review to ferret out potential usage problems before implementing. E.g., Fortran had to do something similar when they went from strictly 1-based indexing to any-based indexing.

James Tursa