From: David on
I am attempting to write a script to parse some log files containing timestamped sensor data, and separate out each variable into individual arrays for each sensor.

For example:

18 16:25:54.723 - - 121,EPV,AftCurrent,0.44,A
18 16:25:54.750 - - 51,Fore Rudder,PX,96,
18 16:25:54.751 - - 63,Aft Thruster,IB,1,
18 16:25:54.756 - - 61,
18 16:25:54.759 - - 131,XSENS,Temperature,38.9,degC
18 16:25:54.759 - - 131,XSENS,AccX,-0.1,mps2,AccY,-0.1,mps2,AccZ,9.8,mps2
18 16:25:54.759 - - 131,XSENS,GyrX,-0.0,turns,GyrY,0.0,turns,GyrZ,0.0,turns

I would like to split this data into separate arrays, one per parameter, each with a "time" column and a "parameter" column, such as
xsens_temp = [16:25:54.759, 38.9; ...
16:25:55.643, 38.8; ]
and other arrays (xsens_gyrx, xsens_gyry, etc.)

I have been experimenting with textscan and sscanf with limited success, but I wonder if there is a better way to do this?
From: Rune Allnor on
On 20 May, 18:15, "David " <robot.daveNOS...(a)gmail.com> wrote:
> I have been experimenting with textscan and sscanf with limited success, but I wonder if there is a better way to do this?

Define 'better'.

There is a way that is very efficient in terms of programming
effort: it gets the job done with a minimum of coding, but it
might run somewhat slower than the very fastest methods, and it
takes some time and effort to learn.

The key is to use regular expressions. Find the book by Friedl to
learn how to work them (yes, you *need* that kind of book), and
write a function that goes something like this:

- The function takes a filename and a regular expression as arguments
- The function scans the file on a line-by-line basis
- The function extracts the time stamps from the lines that
  match the regular expression
- The function returns other data from the lines that match
  the regular expression

This is the preferable way to do things, but it requires skills
that take some time to learn.

Rune
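
A minimal sketch of the approach Rune outlines, written as a script so that it runs on its own. The sample lines are copied from the original post; in a real function they would be read from the log file. The variable names and the regular expression are assumptions made for illustration, not code from the thread, and the pattern only targets the XSENS temperature lines. Time stamps are converted to seconds since midnight so that both columns are numeric.

```matlab
% Sample lines copied from the original post (stand-in for the log file).
lines = { ...
    '18 16:25:54.723 - - 121,EPV,AftCurrent,0.44,A', ...
    '18 16:25:54.756 - - 61,', ...
    '18 16:25:54.759 - - 131,XSENS,Temperature,38.9,degC', ...
    '18 16:25:54.759 - - 131,XSENS,AccX,-0.1,mps2,AccY,-0.1,mps2,AccZ,9.8,mps2'};

% Hypothetical pattern: capture hours, minutes, seconds and the value
% that follows "XSENS,Temperature,". Other sensors need their own pattern.
pattern = '^[0-9]+ ([0-9]+):([0-9]+):([0-9.]+) .*,XSENS,Temperature,(-?[0-9.]+),';

xsens_temp = zeros(0, 2);          % columns: seconds since midnight, value
for k = 1:numel(lines)
    tok = regexp(lines{k}, pattern, 'tokens', 'once');
    if isempty(tok)
        continue;                  % line does not match this sensor
    end
    v = str2double(tok);           % 1x4 numeric: h, m, s, value
    xsens_temp(end+1, :) = [v(1)*3600 + v(2)*60 + v(3), v(4)]; %#ok<SAGROW>
end
```

With these sample lines only the temperature line matches, so xsens_temp ends up with a single row: 16:25:54.759 becomes 59154.759 seconds, paired with the value 38.9. Growing the array inside the loop keeps the sketch short but is not the fastest choice for large files.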
From: David on
Rune Allnor <allnor(a)tele.ntnu.no> wrote in message <fe4742da-3f6a-42c0-9919-fd7602dadd31(a)y12g2000vbg.googlegroups.com>...

Thanks for the suggestion. It had been a long time since I last used regular expressions, but after a few refreshers from helpful websites it worked great!

The only remaining issue is the execution time of the script: it takes over 10 minutes for a 2.5 hr log file, and I'd like to trim that down to 3-4 minutes (normal log files will be 8-10 hours). The MATLAB Profiler is proving to be quite helpful as well!
From: Rune Allnor on
On 27 May, 17:54, "David " <robot.daveNOS...(a)gmail.com> wrote:
> Thanks for the suggestion. It had been a long time since I last used regular expressions, but after a few refreshers from helpful websites it worked great!
>
> The only remaining issue is the execution time of the script: it takes over 10 minutes for a 2.5 hr log file, and I'd like to trim that down to 3-4 minutes (normal log files will be 8-10 hours). The MATLAB Profiler is proving to be quite helpful as well!

Use a faster parser than MATLAB.

As I understand it, there is a Perl parser included with MATLAB.
You might find that it speeds things up a bit.

Rune
From: us on
"David "
> Thanks for the suggestion. It had been a long time since I last used regular expressions, but after a few refreshers from helpful websites it worked great!
>
> The only remaining issue is the execution time of the script: it takes over 10 minutes for a 2.5 hr log file, and I'd like to trim that down to 3-4 minutes (normal log files will be 8-10 hours). The MATLAB Profiler is proving to be quite helpful as well!

here, the important question is: how do you read the file(?)...
- let's hope you do NOT use FGETL or a sibling...
moreover, what size does your log-file have(?)...

us
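
One possible reading of the question about how the file is read: instead of fetching lines one at a time with FGETL, load the whole file into a single string and run one regexp call over it. The sketch below assumes that approach; it is not code from the thread. The text is built with sprintf so the example runs on its own (in practice it would come from something like fileread with the real file name), the second temperature line is constructed from the desired output in the original post, and the pattern is again a hypothetical one for the XSENS temperature lines only. Whether this is actually faster for a given log file is worth checking with the Profiler.

```matlab
% Stand-in for the file contents; in practice: txt = fileread(filename);
txt = sprintf('%s\n', ...
    '18 16:25:54.756 - - 61,', ...
    '18 16:25:54.759 - - 131,XSENS,Temperature,38.9,degC', ...
    '18 16:25:55.643 - - 131,XSENS,Temperature,38.8,degC');

% Hypothetical pattern, applied to the whole text in one call.
% 'lineanchors' lets ^ match at the start of every line, and
% 'dotexceptnewline' stops .* from running across line boundaries.
pattern = '^[0-9]+ ([0-9]+):([0-9]+):([0-9.]+) .*,XSENS,Temperature,(-?[0-9.]+),';
tok = regexp(txt, pattern, 'tokens', 'lineanchors', 'dotexceptnewline');

if isempty(tok)
    xsens_temp = zeros(0, 2);      % no matching lines
else
    tok = vertcat(tok{:});         % N-by-4 cell array of strings
    v = str2double(tok);           % N-by-4 numeric: h, m, s, value
    % columns: seconds since midnight, value
    xsens_temp = [v(:,1)*3600 + v(:,2)*60 + v(:,3), v(:,4)];
end
```

Here xsens_temp has two rows, one per temperature line, and the conversion to numbers happens once for all matches rather than once per line.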