From: David on
"us " <us(a)neurol.unizh.ch> wrote in message <htmagu$deg$1(a)fred.mathworks.com>...
> "David "
> > Thanks for the suggestion. It has been a long time since I last used regular expressions, but after a few refreshers from helpful websites it worked great!
> >
> > The only issue I have now is the execution time of the script: it takes over 10 minutes for a 2.5 hr log file, and I'd like to trim that down to 3-4 minutes (normal log files will be 8-10 hours). The Matlab Profiler is proving to be quite helpful as well!
>
> here, the important question is: how do you read the file(?)...
> - let's hope you do NOT use FGETL or a sibling...
> moreover, what size does your log-file have(?)...
>
> us

I am using fgetl; is there a better function for retrieving data line by line?

Typically, log files are 20MB per hour, although they may be larger in the future (as more sensors or other data are added).

Also, interestingly, the Matlab profiler showed me that the largest time consumers are datenum and my array concatenations. I am *not* preallocating space for the arrays; however, I did test with preallocation and did not find any significant increase in speed.
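
One thing I am considering, assuming all timestamps share a single known format (the format string and the 'stamps' variable below are hypothetical): collecting the timestamp strings and calling datenum once on the whole cell array with an explicit format, instead of once per line, to avoid the per-call format guessing.

    % hypothetical timestamps collected into a cell array while parsing
    stamps = {'2010-05-27 19:50:01'; '2010-05-27 19:50:02'};
    % one vectorized call with an explicit format string
    nums = datenum(stamps, 'yyyy-mm-dd HH:MM:SS');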

-Dave
From: Rune Allnor on
On 27 May, 19:50, "David " <robot.daveNOS...(a)gmail.com> wrote:
> "us " <u...(a)neurol.unizh.ch> wrote in message <htmagu$de...(a)fred.mathworks.com>...
> > "David "
> > > Thanks for the suggestion. It has been a long time since I last used regular expressions, but after a few refreshers from helpful websites it worked great!
>
> > > The only issue I have now is the execution time of the script: it takes over 10 minutes for a 2.5 hr log file, and I'd like to trim that down to 3-4 minutes (normal log files will be 8-10 hours). The Matlab Profiler is proving to be quite helpful as well!
>
> > here, the important question is: how do you read the file(?)...
> > - let's hope you do NOT use FGETL or a sibling...
> > moreover, what size does your log-file have(?)...
>
> > us
>
> I am using fgetl; is there a better function for retrieving data line by line?
>
> Typically, log files are 20MB per hour, although they may be larger in the future (as more sensors or other data are added).
>

You might want to switch to a compiled language like C++.
The problem is not that you access the file on a line-by-line
basis, but how the file I/O is buffered: if the program hits
the disk every time you read a line, it will be very slow.
If, on the other hand, it loads a couple of megabytes per
disk access, it will be a lot faster, since most lines are
then read from the buffer rather than the disk.
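
If you want to stay in matlab, a minimal sketch of the
buffered approach, assuming the log fits in memory (the
file name below is a placeholder): read the whole file
with one FREAD call and split it into lines in memory,
so the disk is touched once instead of once per line.

    fid = fopen('sensors.log', 'r');        % placeholder name
    raw = fread(fid, inf, '*char')';        % one big read -> char row vector
    fclose(fid);
    lines = regexp(raw, '\r?\n', 'split');  % split into lines in memory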

> Also, interestingly, the Matlab profiler showed me that the largest time consumers are datenum

What do you use DATENUM for? Do you have to use it? Could
you get away with a simpler function, like SSCANF?
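
For instance, a minimal sketch assuming a fixed
'HH:MM:SS.FFF' timestamp (the format here is hypothetical):
one SSCANF call pulls out the fields, and plain arithmetic
turns them into seconds, with no DATENUM involved.

    t = sscanf('13:45:07.250', '%d:%d:%f');  % -> [13; 45; 7.25]
    secs = t(1)*3600 + t(2)*60 + t(3);       % seconds since midnight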

> and my array concatenations. I am *not* preallocating space for the arrays; however, I did test with preallocation and did not find any significant increase in speed.

These kinds of details become very cumbersome in matlab,
but are taken care of internally in C++, where containers
such as std::vector manage their own storage growth.
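
For reference, a minimal matlab sketch of the difference
(n and the loop bodies are arbitrary): growing an array
inside the loop reallocates it on every pass, whereas a
preallocated array is written in place.

    n = 1e5;
    a = [];
    for k = 1:n
        a(end+1) = k;        % grows the array: repeated reallocation
    end
    b = zeros(1, n);         % preallocated once
    for k = 1:n
        b(k) = k;            % written in place
    end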

Rune
From: dpb on
David wrote:
....

> Also, interestingly, the Matlab profiler showed me that the largest time
> consumers are datenum and my array concatenations. I am *not*
> preallocating space for the arrays; however, I did test with
> preallocation and did not find any significant increase in speed.
....

Those two statements don't correlate unless the fractional percentage of
the concatenation is far down compared to datenum (altho I suppose there
could be some issue w/ the profiler results, or there could be several
other contributors besides those two, all roughly equally to blame).
Otherwise, it would seem that cutting down on a sizable fraction would
have some noticeable benefit.

Which raises the question: what was the non-preallocated code that
indicated a time bottleneck, and how was it implemented w/ preallocation?

--