From: James Tursa on
"Ron Klein" <ron(a)xsights.com> wrote in message <hqs57f$60g$1(a)fred.mathworks.com>...
>
> The main problem I'm facing is that I cannot just perform memcpy (memory copy) from the mxArray pointer I have to a new memory block.
> That's because there's no documentation about the *memory* representation of the mxArray.

Yes, true. You can get *some* info about it on the web, but in general TMW has not given you the proprietary details of what an mxArray struct really is.

> For *simple* data types, such as a matrix of doubles, it seems that there is a solution, using mxGetPr function.

Yes, true.

> Other data types are also supported, with the equivalent functions (for instance, mxGetPi for integers). Like this snippet:
> memcpy((void *)mxGetPr(mtlbDataArray),(void *)data,sizeof(double) * N);

You don't need the (void *) casts, since that conversion happens automatically by virtue of the memcpy prototype. Also, mxGetPi returns a pointer to the imaginary part of the data and is meant for double class variables, not for integers. mxGetData and mxGetImagData serve the same purpose for other classes (single, int32, etc.)

> However, if the mxArray pointer holds something more complex than a matrix, like a structure that has a class in one of its fields, and the class has a double matrix and an integer matrix... I might be wrong here about the terms, but here's the point: I know *nothing* about the memory representation of a non trivial mxArray. I found no documentation about it.

All of the access functions are there. If the object is a cell array then you can use mxGetCell to get a pointer to the mxArray element. So a loop with some recursive code will be able to get at everything. Similar comments for a struct and its fields. But it *will* be a fair amount of work to code this up.
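
To give an idea of the shape of that recursion, here is a minimal sketch in plain C. It does not use the real mxArray at all: `Node`, `NodeKind`, and `node_payload_bytes` are hypothetical stand-ins made up for illustration. With the real API the same walk would go through the documented accessors (mxIsCell with mxGetCell, mxIsStruct with mxGetNumberOfFields and mxGetFieldByNumber), and a full serializer would write each piece out instead of just adding up sizes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a nested variable; NOT the real mxArray. */
typedef enum { NODE_DOUBLE, NODE_CELL } NodeKind;

typedef struct Node {
    NodeKind kind;
    size_t count;            /* elements (leaf) or number of children (cell) */
    const double *data;      /* leaf payload; NULL for a cell */
    struct Node **children;  /* cell contents; NULL for a leaf, slots may be NULL */
} Node;

/* Recursively total the payload bytes reachable from n. */
size_t node_payload_bytes(const Node *n) {
    if (n == NULL) return 0;                 /* unpopulated cell slot */
    if (n->kind == NODE_DOUBLE)
        return n->count * sizeof(double);    /* leaf: just its own data */
    assert(n->children != NULL);             /* a cell must carry a child table */
    size_t total = 0;
    for (size_t i = 0; i < n->count; i++)
        total += node_payload_bytes(n->children[i]);  /* descend into each child */
    return total;
}
```

The NULL check matters in the real API as well, since mxGetCell can return NULL for a cell element that was never populated.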

> All I can say for sure, is that if I use matGetNextVariable I have a proper mxArray pointer. But matGetNextVariable must use a MATFile pointer as input, which is there only if I use matOpen function. And this matOpen function *must* get a string (a char*) to a file name, which should exist on the disk.
> So basically, I know how to read a simple matrix from a mat file, and store it in the memory for later usage as an mxArray. This is because I know that memcpy works in such scenarios.

I don't know what you mean by this. A variable in a mat file occupies a contiguous set of bytes in the file in a specific format that has all the size, type, data, etc info. When you read such a variable using matGetNextVariable it reads the bytes from the file and constructs an mxArray in memory to hold all of that information. But the variable in memory is *not* contiguous, as you seem to be implying. There is an mxArray structure that contains stuff like type and size info, but there are also pointers in this structure that point to the actual data. That data, in general, is *not* contiguous with the mxArray structure itself. So a simple memcpy on the mxArray structure will not work ... you have to copy the data separately with a separate memcpy.
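
A minimal sketch of what "copy the pieces separately" means, in plain C. The real mxArray layout is not documented, so `FakeArray`, `fake_serialize`, and `fake_deserialize` are hypothetical stand-ins for illustration only, limited to double data and at most 4 dimensions. The header holds a pointer to a separately allocated data buffer, so a memcpy of the header would copy only the pointer. A flat copy has to write the header fields and then the data. With the real API the pieces would come from mxGetNumberOfDimensions, mxGetDimensions, mxGetData, and mxGetElementSize.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an array header; NOT the real mxArray layout. */
typedef struct {
    size_t ndim;     /* number of dimensions */
    size_t dims[4];  /* dimension sizes (this sketch supports up to 4) */
    double *data;    /* separately allocated, not contiguous with the header */
} FakeArray;

static size_t fake_numel(const FakeArray *a) {
    size_t n = 1;
    for (size_t i = 0; i < a->ndim; i++) n *= a->dims[i];
    return n;
}

/* Flatten into one malloc'ed block: [ndim][dims...][data...]. Caller frees. */
unsigned char *fake_serialize(const FakeArray *a, size_t *out_len) {
    assert(a != NULL && out_len != NULL);
    if (a->ndim > 4) return NULL;
    size_t n = fake_numel(a);
    size_t len = sizeof(size_t) * (1 + a->ndim) + sizeof(double) * n;
    unsigned char *buf = malloc(len);
    if (!buf) return NULL;
    unsigned char *p = buf;
    memcpy(p, &a->ndim, sizeof(size_t));
    p += sizeof(size_t);
    memcpy(p, a->dims, sizeof(size_t) * a->ndim);
    p += sizeof(size_t) * a->ndim;
    memcpy(p, a->data, sizeof(double) * n);  /* the data is copied separately */
    *out_len = len;
    return buf;
}

/* Rebuild a FakeArray that owns a freshly allocated data buffer. */
int fake_deserialize(const unsigned char *buf, FakeArray *out) {
    const unsigned char *p = buf;
    memcpy(&out->ndim, p, sizeof(size_t));
    p += sizeof(size_t);
    if (out->ndim > 4) return -1;
    memset(out->dims, 0, sizeof out->dims);
    memcpy(out->dims, p, sizeof(size_t) * out->ndim);
    p += sizeof(size_t) * out->ndim;
    size_t n = fake_numel(out);
    out->data = malloc(sizeof(double) * n);
    if (!out->data) return -1;
    memcpy(out->data, p, sizeof(double) * n);
    return 0;
}
```

Note that this writes raw size_t and double values, so the block is only meaningful between machines with the same word size and byte order. A portable format (the mat file format, for example) pins those details down.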

> But I don't know how to clone/duplicate complex data structures. I don't know if the memory allocated by these structures is in one contiguous block or not, for instance.

No, it isn't, per my comment above. And even for simple mxArray variables like doubles, the header and the data do not, in general, sit in one contiguous memory block.

> I found no documentation about it so far.

MATLAB gives you interface functions to get at everything you need, like the type, size, and data pointers. If you have these interface functions then you don't need to know how it is stored behind the scenes or whether it is all in one contiguous block or not. Just work with the pieces you are given from the interface functions.

> As for the post you mentioned ( http://www.mathworks.com/matlabcentral/fileexchange/26731-portable-matfile-exporter-in-c ), it describes simple data types only.

It does both a simple double example and a cell example if I recall correctly.

------------------------------------------

So the task you ask for can be done with some work. But I would like to take a step back and ask why you want to do this. Once you read a variable from a mat file into memory you have it in memory. If you want to retain it in memory for use later on why not just retain it as is? Why are you asking to copy it to a new contiguous memory block (serializing it) for retention and later extraction? What's wrong with just keeping it as an mxArray (in non-contiguous blocks) and then later using it when needed? What problem are you really trying to solve here?

James Tursa
From: Chris Hulbert on
On Apr 23, 8:53 am, "Ron Klein" <r...(a)xsights.com> wrote:
> "James Tursa" <aclassyguy_with_a_k_not_...(a)hotmail.com> wrote in message <hqqa9n$7j...(a)fred.mathworks.com>...
> > "Ron Klein" <r...(a)xsights.com> wrote in message <hqq981$k7...(a)fred.mathworks.com>...
>
> > > Bottom line: yes, I am looking for code (if such exists) in C/C++ that copies my mxArray (which is, in fact, a non-trivial data structure) into one contiguous memory block. Then I need the deserializing function, that gets a memory block and returns an mxArray (pointer).
>
> > > As for the data structure itself, I'm not sure about its inner structure. All I know, so far, is that it's not trivial, and it has a few fields.
>
> > > The detailed context: this memory block will be cached in the computer's memory. If it is stored in the memory, there's no need to access files stored in the disk, so we get a performance boost.
>
> > > Thanks for the help so far!
>
> > Rather than invent your own format, why not just use the mat file format and write the output to memory instead of a file?  The mat file format is public. I don't know of any code out there to do this generically, however. A web search might be in order. One example of this for writing a double array and a cell array to a file can be found here:
>
> > http://www.mathworks.com/matlabcentral/fileexchange/26731-portable-ma...
>
> > I imagine this technique could be modified to write to a memory block instead of a file.  The mat file format can be found here:
>
> > http://www.mathworks.com/access/helpdesk/help/pdf_doc/matlab/matfile_...
>
> > For your purposes you wouldn't want to do any of the compressed stuff, so that will make it a bit easier.
>
> > James Tursa
>
> Hi again,
>
> FOA, thanks for the help so far.
> To the issue itself:
> The main problem I'm facing is that I cannot just perform memcpy (memory copy) from the mxArray pointer I have to a new memory block.
> That's because there's no documentation about the *memory* representation of the mxArray.
> For *simple* data types, such as a matrix of doubles, it seems that there is a solution, using mxGetPr function. Other data types are also supported, with the equivalent functions (for instance, mxGetPi for integers). Like this snippet:
> memcpy((void *)mxGetPr(mtlbDataArray),(void *)data,sizeof(double) * N);


If you want to use it later as an mxArray *, why copy the data at all?
Just keep the pointer around and do not destroy the array. Perhaps
explaining more about what you want to do with it later would provide
more context?

Chris

>
> However, if the mxArray pointer holds something more complex than a matrix, like a structure that has a class in one of its fields, and the class has a double matrix and an integer matrix... I might be wrong here about the terms, but here's the point: I know *nothing* about the memory representation of a non trivial mxArray. I found no documentation about it.
> All I can say for sure, is that if I use matGetNextVariable I have a proper mxArray pointer. But matGetNextVariable must use a MATFile pointer as input, which is there only if I use matOpen function. And this matOpen function *must* get a string (a char*) to a file name, which should exist on the disk.
> So basically, I know how to read a simple matrix from a mat file, and store it in the memory for later usage as an mxArray. This is because I know that memcpy works in such scenarios.
> But I don't know how to clone/duplicate complex data structures. I don't know if the memory allocated by these structures is in one contiguous block or not, for instance. I found no documentation about it so far.
>
> As for the post you mentioned (http://www.mathworks.com/matlabcentral/fileexchange/26731-portable-ma...), it describes simple data types only.
>
> And again, thanks for the help so far.
>
> Ron

From: James Tursa on
Chris Hulbert <cchgroupmail(a)gmail.com> wrote in message <0a57eeba-f54a-433b-9284-6fb56cee80f0(a)n6g2000vbf.googlegroups.com>...
>
> If you want to use it later as an mxArray *, why copy the data at all?
> Just keep the pointer around and do not destroy the array. Perhaps
> explaining more about what you want to do with it later would provide
> more context?

We think alike ...

James Tursa
From: EBS on
"Ron Klein" <ron(a)xsights.com> wrote in message <hqq981$k7k$1(a)fred.mathworks.com>...
> "James Tursa" <aclassyguy_with_a_k_not_a_c(a)hotmail.com> wrote in message <hqpqjb$42n$1(a)fred.mathworks.com>...
> > "Ron Klein" <ron(a)xsights.com> wrote in message <hqppkq$fct$1(a)fred.mathworks.com>...
> > > I'm using Matlab API in C++.
> > > I open a mat file (using matOpen function), and I read the first (and only variable) in it to an mxArray pointer (mxArray*).
> > > I'd like to copy the memory representation of this mxArray in order to reuse it later on, without the overhead of opening the matfile again.
> > > ============================
> > > relevant code:
> > > MATFile *pmat;
> > > const char* name=NULL;
> > > mxArray *pa;
> > >
> > > pmat = matOpen("myfile.mat", "r");
> > > pa = matGetNextVariable(pmat, &name);
> > >
> > > // how do I serialize "pa"?
> > > ============================
> > >
> > > I saw this post:
> > > http://www.mathworks.co.uk/matlabcentral/newsreader/view_thread/141797
> > > The post describes a method of serialization, and then deserialization, from an mxArray to "regular bytes" (unsigned chars in C/C++). However, none of the links there are available.
> > > Is there any other post/documentation for serialization and deserialization?
> > > Thanks!
> >
> > Are you simply looking for code that puts all of the parts of the mxArray (type, number of dimensions, dimensions, real data, imag data, etc.) into one contiguous block of memory? Do you need to handle cell arrays and struct arrays or just numeric arrays? Is this ultimately intended for writing to a binary file and then reading it in later on? Or what?
> >
> > James Tursa
>
> Bottom line: yes, I am looking for code (if such exists) in C/C++ that copies my mxArray (which is, in fact, a non-trivial data structure) into one contiguous memory block. Then I need the deserializing function, that gets a memory block and returns an mxArray (pointer).
>
> As for the data structure itself, I'm not sure about its inner structure. All I know, so far, is that it's not trivial, and it has a few fields.
>
> The detailed context: this memory block will be cached in the computer's memory. If it is stored in the memory, there's no need to access files stored in the disk, so we get a performance boost.
>
> Thanks for the help so far!

Check this thread (beware of line wrapping breaking the link). It discusses the undocumented 'mxSerialize' function and another method that uses an undocumented SAVE stdio argument:
http://www.mathworks.com/matlabcentral/newsreader/view_thread/141797
From: Ron Klein on
"James Tursa" <aclassyguy_with_a_k_not_a_c(a)hotmail.com> wrote in message <hqsgbq$em5$1(a)fred.mathworks.com>...

Thanks again for the insights and comments. Appreciated.

I'll provide more context for the issue, so you can understand my motivation.

I want to read the mat file's variable in one machine, serialize it to a byte array (unsigned char pointer in C++ context), and then send this serialized data to some other machines to do some matlab processing on it, each one with different parameters.
For instance, let's say I have a matlab function in an m file, which takes 2 parameters: my data structure and an integer, say, from 1 to 5.
Let's just say that my function's signature is something like this:
MyCoolFunc(data, num)
Now let's say that the calculation takes 20 seconds.
If I run the calculation with num=1, num=2, ..., num=5 on the same machine, one after another, I end up with 100 seconds of processing.
So I want to make it parallel: let's say I have 5 additional machines, named PC1 to PC5.
PC1 is configured to run the function with num as 1, PC2 with num as 2, and so on.

This works as long as there is a single mat file that never changes:
I can load it on PC1 to PC5, keep the mxArray pointer somewhere in my code (as suggested), and all is fine.
But that's not my situation: there are many mat files, and new ones are added dynamically.
If I want PC1 to PC5 to execute MyCoolFunc with a newly created mat file, I have to do the following:
1. Copy the mat file to each PC
2. Load the data structure from the disk
3. Execute MyCoolFunc

Loading the data from the disk is the painful part when it comes to performance.
This is the main reason for this post in the first place.

One might say that I could load each and every mat file in my system on PC1 to PC5, but that would require a huge amount of memory, and it's not feasible.

I thought I could serialize the data structure to a simple in-memory representation, like a byte array, then send it to PC1 to PC5 using a TCP connection (this is the simple part), and then deserialize the whole thing to construct a valid mxArray pointer. No disk I/O needed: the whole process involves only memory and network traffic (which can't be avoided whatsoever).
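
For the TCP part, a rough sketch of how the byte array could be framed so the receiver knows how many bytes to expect before deserializing. This is only an assumption about how the transfer might look: `frame_pack` and `frame_payload_len` are hypothetical helper names, the 8-byte big-endian length prefix is one common convention among several, and the actual socket calls are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Build [8-byte big-endian length][payload] in one malloc'ed block. Caller frees. */
unsigned char *frame_pack(const unsigned char *payload, uint64_t len, size_t *out_len) {
    assert(out_len != NULL);
    unsigned char *buf = malloc(8 + (size_t)len);
    if (!buf) return NULL;
    for (int i = 0; i < 8; i++)
        buf[i] = (unsigned char)(len >> (56 - 8 * i));  /* most significant byte first */
    if (len) memcpy(buf + 8, payload, (size_t)len);
    *out_len = 8 + (size_t)len;
    return buf;
}

/* Read the length prefix back from the first 8 bytes of a frame. */
uint64_t frame_payload_len(const unsigned char *frame) {
    uint64_t len = 0;
    for (int i = 0; i < 8; i++)
        len = (len << 8) | frame[i];
    return len;
}
```

The receiver would read 8 bytes, call frame_payload_len, and then keep reading until that many payload bytes have arrived.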

So, the bottom line: I can't just keep the mxArray, I want to serialize it and reconstruct it.

Thanks all.