From: Dino Vliet on
Dear debian people,

Can you help me with this task I have? I have a lot of files in a subdirectory containing the following text:

Correctly Classified Instances     3018117               56.6808 %
Incorrectly Classified Instances   2306643               43.3192 %
Kappa statistic                          0.2443
Mean absolute error                      0.4304
Root mean squared error                  0.4586
Relative absolute error                124.1251 %
Root relative squared error            110.1308 %
Total Number of Instances          5324760    


=== Detailed Accuracy By Class ===

TP Rate   FP Rate   Precision   Recall  F-Measure   ROC Area  Class
  0.618     0.343      0.681     0.618     0.648      0.697    1
  0.519     0.244      0.617     0.519     0.564      0.693    2
  0.296     0.141      0.056     0.296     0.094      0.66     3


=== Confusion Matrix ===

       a       b       c   <-- classified as
 1784321  684983  416649 |       a = 1
  787342 1190428  314537 |       b = 2
   49255   53877   43368 |       c = 3

I need to parse each file and collect the following information in a CSV file:

Correctly Classified Instances, Kappa statistic, Total Number of Instances, Precision {1}, Recall {1}, F-Measure {1},Precision {2}, Recall {2}, F-Measure {2},Precision {3}, Recall {3}, F-Measure {3},a,b,c,a,b,c,a,b,c
56.6808, 0.2443, 5324760, 0.681,0.618,0.648,0.617,0.519,0.564, 0.056,0.296,0.094,1784321,684983,416649,787342,1190428,314537,49255,53877,43368

Does anyone have an idea how this could be accomplished?
I am not that great at programming, so writing a Ruby or shell script to do this would take me weeks :-(

Thanks
Dino

From: Joao Ferreira gmail on
On Tue, 2010-08-03 at 12:12 -0700, Dino Vliet wrote:
> Does anyone have an idea how this could be accomplished?
> I am not that great at programming, so writing a Ruby or shell script to
> do this would take me weeks :-(

use perl !!!

now seriously: use perl.

don't wander around; perl is the way you should go; use perl !

:) it's so easy ...

open up a console and read these 2 perl manuals

$ perldoc perlintro
$ perldoc perlrequick

jmf




--
To UNSUBSCRIBE, email to debian-user-REQUEST(a)lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster(a)lists.debian.org
Archive: http://lists.debian.org/1280863933.4378.4.camel(a)debj5n.critical.pt
From: Kumar Appaiah on
On Tue, Aug 03, 2010 at 12:12:26PM -0700, Dino Vliet wrote:
> Dear debian people,
>
> Can you help me with this task I have? I have a lot of files in a subdirectory
> containing the following text:

You should use awk.

- cut -

> I need to parse each file and collect the following information in a CSV file:
>
> Correctly Classified Instances, Kappa statistic, Total Number of Instances,
> Precision {1}, Recall {1}, F-Measure {1},Precision {2}, Recall {2}, F-Measure
> {2},Precision {3}, Recall {3}, F-Measure {3},a,b,c,a,b,c,a,b,c
> 56.6808, 0.2443, 5324760, 0.681,0.618,0.648,0.617,0.519,0.564,
> 0.056,0.296,0.094,1784321,684983,416649,787342,1190428,314537,49255,53877,43368
>
> Does anyone have an idea how this could be accomplished?
> I am not that great at programming, so writing a Ruby or shell script to do this
> would take me weeks :-(

A starting point in Awk for processing a single file would be:

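BEGIN {
    n_equals = 0;    # number of "=== ... ===" section headers seen so far
}

# Summary block, before the first section header.
# $(NF - 2) is the raw count; $(NF - 1) would be the percentage.
n_equals == 0 && /Correctly Classified/ {
    CCI = $(NF - 2);
}

n_equals == 0 && /Incorrectly Classified/ {
    ICI = $(NF - 2);
}

n_equals == 0 && /Kappa statistic/ {
    KS = $NF
}

# Count each section header line.
/ ===/ { n_equals = n_equals + 1 }

# Skip the column-header row of the per-class table.
n_equals == 1 && /TP Rate/ {
    next;
}

# More complicated processing

END {
    printf "%d,", CCI
    printf "%d,", ICI
    printf "%f\n", KS
}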

You ought to read the Awk manual, and then it would be a matter of a
couple of hours of thought at most.
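
For what it is worth, the starting point above can be carried through to the full row Dino asked for. What follows is only a sketch under assumptions, not a tested tool: the function name weka2csv is made up for illustration, the files are assumed to look exactly like the sample (seven columns in the per-class table, a "|" after the counts in each confusion matrix row), and it takes the percentage from the "Correctly Classified" line because that is what the requested row shows.

```shell
#!/bin/sh
# weka2csv (hypothetical name): print one CSV row per input file.
# Assumes every file has the same layout as the sample in this thread.
weka2csv() {
    awk '
    function emit() { print pct "," kappa "," total acc cm }

    # First line of each input file: flush the previous row, then reset.
    FNR == 1 {
        if (NR > 1) emit()
        pct = kappa = total = acc = cm = section = ""
    }

    # Remember which section we are in.
    /=== Detailed Accuracy/ { section = "acc"; next }
    /=== Confusion Matrix/  { section = "cm";  next }

    # Summary values: the percentage is the second field from the end.
    /Correctly Classified/      { pct   = $(NF - 1) }
    /Kappa statistic/           { kappa = $NF }
    /Total Number of Instances/ { total = $NF }

    # Per-class rows: TP FP Precision Recall F-Measure ROC Class.
    section == "acc" && NF == 7 && $1 ~ /^[0-9.]+$/ {
        acc = acc "," $3 "," $4 "," $5
    }

    # Confusion matrix rows: take every count before the "|".
    section == "cm" && /[|]/ {
        for (i = 1; i <= NF && $i != "|"; i++)
            cm = cm "," $i
    }

    END { if (NR > 0) emit() }
    ' "$@"
}
```

Something like "weka2csv subdir/* > summary.csv" would then give one row per file (subdir is a placeholder for the real directory). No header line is printed; it could be written once with echo before the call.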

HTH.

Kumar
--
"Even more amazing was the realization that God has Internet access. I
wonder if He has a full newsfeed?"
(By Matt Welsh)


--
Archive: http://lists.debian.org/20100803194333.GA9899(a)bluemoon.alumni.iitm.ac.in
From: John Hasler on
Dino writes:
> Does anyone have an idea how this could be accomplished?

Perl.

> I am not that great at programming, so writing a Ruby or shell script to
> do this would take me weeks :-(

Hire someone.
--
John Hasler


--
Archive: http://lists.debian.org/87hbjb8plg.fsf(a)thumper.dhh.gt.org
From: Miles Fidelman on
Kumar Appaiah wrote:
> On Tue, Aug 03, 2010 at 12:12:26PM -0700, Dino Vliet wrote:
>
>> Dear debian people,
>>
>> Can you help me with this task I have? I have a lot of files in a subdirectory
>> containing the following text:
>>
> You should use awk.
>
> - cut -
>
> You ought to read the Awk manual, and then it would be a matter of a
> couple of hours of thought at most.
>
you might want to start by perusing the "sed" manual - it's an even
simpler tool, though it might not be powerful enough for what you're doing
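
As a rough illustration of what sed handles comfortably, here is a minimal sketch that pulls a single value out of a file. The function name kappa_of is invented for this example, and it assumes the "Kappa statistic" line looks like the one in the sample. Combining values from several differently formatted sections into one row is where sed alone would likely become awkward.

```shell
#!/bin/sh
# kappa_of (hypothetical name): print the Kappa statistic of each file given.
# -n suppresses normal output; the trailing p prints only lines where the
# substitution matched, leaving just the captured number.
kappa_of() {
    sed -n 's/^[[:space:]]*Kappa statistic[[:space:]]*\([-0-9.]*\)[[:space:]]*$/\1/p' "$@"
}
```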

also take a look at:
http://www.smashingmagazine.com/2009/04/10/25-text-batch-processing-tools-reviewed/
not Unix, but a collection of various visual tools for processing text
in batches

looks to me like your biggest problem is that each file has several
sections, each in a different format, so it's not just a matter of getting
everything into a uniform tabular structure for import into a
spreadsheet. You might want to think of this as a several-step process
that either:
a. breaks each file into several files, each of a uniform format, then
processes each type of file separately, or
b. processes each file to normalize it into something that's easier to
turn into csv format
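
A minimal sketch of the normalizing step, under assumptions: the function name tag_sections and the label "Summary" for the lines before the first header are made up here, and section headers are assumed to be lines of the form "=== Name ===" as in the sample. It prefixes every non-blank line with the name of its section and a tab, so a later step can pick out one section at a time.

```shell
#!/bin/sh
# tag_sections (hypothetical name): prefix each non-blank line with the
# name of the section it belongs to, separated by a tab.
tag_sections() {
    awk '
    # Lines before the first header get the made-up label "Summary".
    FNR == 1 { section = "Summary" }

    # A header such as "=== Confusion Matrix ===" starts a new section.
    /^ *===.*=== *$/ {
        section = $0
        gsub(/^[ =]+|[ =]+$/, "", section)   # strip the "===" decoration
        next
    }

    # Emit every non-blank line with its section name in front.
    NF > 0 { print section "\t" $0 }
    ' "$@"
}
```

Piping the result through grep '^Confusion Matrix', for example, would leave only the matrix lines, which all share one format.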

Or, as someone suggested - hire someone. This is the silly kind of task
that's really easy if you're facile with regular expressions, shell
scripts, and such, but can end up taking forever to get right. Judging
from the sample data, I'm guessing you're at a university; there should be
enough student hackers around who work cheap.

Miles Fidelman


--
In theory, there is no difference between theory and practice.
In<fnord> practice, there is. .... Yogi Berra



--
Archive: http://lists.debian.org/4C58969D.7080302(a)meetinghouse.net