Prev: how to config /etc/fetchmailrc
Next: wget question
From: Dino Vliet on 3 Aug 2010 15:20

Dear debian people,

Can you help me with this task I have? I have a lot of files in a subdirectory containing the following text:

    Correctly Classified Instances     3018117               56.6808 %
    Incorrectly Classified Instances   2306643               43.3192 %
    Kappa statistic                          0.2443
    Mean absolute error                      0.4304
    Root mean squared error                  0.4586
    Relative absolute error                124.1251 %
    Root relative squared error            110.1308 %
    Total Number of Instances          5324760

    === Detailed Accuracy By Class ===

    TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area   Class
      0.618     0.343       0.681    0.618       0.648      0.697   1
      0.519     0.244       0.617    0.519       0.564      0.693   2
      0.296     0.141       0.056    0.296       0.094      0.66    3

    === Confusion Matrix ===

          a        b        c   <-- classified as
    1784321   684983   416649 |  a = 1
     787342  1190428   314537 |  b = 2
      49255    53877    43368 |  c = 3

I need to parse this file to get the following information into a csv file:

    Correctly Classified Instances, Kappa statistic, Total Number of Instances, Precision {1}, Recall {1}, F-Measure {1},Precision {2}, Recall {2}, F-Measure {2},Precision {3}, Recall {3}, F-Measure {3},a,b,c,a,b,c,a,b,c
    56.6808, 0.2443, 5324760, 0.681,0.618,0.648,0.617,0.519,0.564, 0.056,0.296,0.094,1784321,684983,416649,787342,1190428,314537,49255,53877,43368

Does anyone have an idea how this could be accomplished? I am not that great at programming, so writing a ruby or shell script to do this would take me weeks :-(

Thanks
Dino
From: Joao Ferreira gmail on 3 Aug 2010 15:40

On Tue, 2010-08-03 at 12:12 -0700, Dino Vliet wrote:
> Does anyone have an idea how this could be accomplished?
> I am not that great at programming, so writing a ruby or shell script
> to do this would take me weeks :-(

use perl !!!

now seriously: use perl. don't wander around; perl is the way you should go; use perl ! :)

it's so easy ... open up a console and read these 2 perl manuals

    $ perldoc perlintro
    $ perldoc perlrequick

jmf

--
To UNSUBSCRIBE, email to debian-user-REQUEST(a)lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmaster(a)lists.debian.org
Archive: http://lists.debian.org/1280863933.4378.4.camel(a)debj5n.critical.pt
From: Kumar Appaiah on 3 Aug 2010 15:50

On Tue, Aug 03, 2010 at 12:12:26PM -0700, Dino Vliet wrote:
> Dear debian people,
>
> Can you help me with this task I have? I have a lot of files in a subdirectory
> containing the following text:

You should use awk.

- cut -

> I need to parse this file to get in a csv file the following information:
>
> Correctly Classified Instances, Kappa statistic, Total Number of Instances,
> Precision {1}, Recall {1}, F-Measure {1},Precision {2}, Recall {2}, F-Measure
> {2},Precision {3}, Recall {3}, F-Measure {3},a,b,c,a,b,c,a,b,c
> 56.6808, 0.2443, 5324760, 0.681,0.618,0.648,0.617,0.519,0.564,
> 0.056,0.296,0.094,1784321,684983,416649,787342,1190428,314537,49255,53877,43368
>
> Does anyone have an idea how this could be accomplished?
> I am not that great at programming, so writing a ruby or shell script to do
> this would take me weeks :-(

A starting point in Awk for processing a single file would be:

    BEGIN {
        n_equals = 0;    # number of "=== ... ===" headers seen so far
    }

    # Summary block (before the first "===" header).
    # $(NF - 2) is the instance count; the percentage is $(NF - 1).
    n_equals == 0 && /Correctly Classified/ {
        CCI = $(NF - 2);
    }

    n_equals == 0 && /Incorrectly Classified/ {
        ICI = $(NF - 2);
    }

    n_equals == 0 && /Kappa statistic/ {
        KS = $NF
    }

    # …

    # Each section header ends in " ===", so count them to track the section.
    / ===/ {
        n_equals = n_equals + 1
    }

    # Skip the column-header row of the per-class table.
    n_equals == 1 && /TP Rate/ {
        next;
    }

    # More complicated processing

    END {
        printf "%d,", CCI
        printf "%d,", ICI
        printf "%f", KS
        # …
    }

You ought to read the Awk manual, and then it would be a matter of a couple of hours of thought at most.

HTH.

Kumar
--
"Even more amazing was the realization that God has Internet access. I wonder if He has a full newsfeed?" (By Matt Welsh)

Archive: http://lists.debian.org/20100803194333.GA9899(a)bluemoon.alumni.iitm.ac.in
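For readers who want to see the section-counting idea above carried through to the full CSV row, here is a minimal sketch. It is written in Python rather than Awk (a swap made purely for illustration, not something anyone in the thread proposed), and it assumes every file has exactly the layout of Dino's sample. The names `SAMPLE` and `parse_report` are hypothetical.

```python
# Sketch only: assumes the report layout shown in the original post.
SAMPLE = """\
Correctly Classified Instances     3018117               56.6808 %
Incorrectly Classified Instances   2306643               43.3192 %
Kappa statistic                          0.2443
Mean absolute error                      0.4304
Root mean squared error                  0.4586
Relative absolute error                124.1251 %
Root relative squared error            110.1308 %
Total Number of Instances          5324760

=== Detailed Accuracy By Class ===

TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area   Class
  0.618     0.343       0.681    0.618       0.648      0.697   1
  0.519     0.244       0.617    0.519       0.564      0.693   2
  0.296     0.141       0.056    0.296       0.094      0.66    3

=== Confusion Matrix ===

      a        b        c   <-- classified as
1784321   684983   416649 |  a = 1
 787342  1190428   314537 |  b = 2
  49255    53877    43368 |  c = 3
"""


def parse_report(text):
    """Return the requested CSV fields for one report as a list of strings."""
    section = 0      # 0 = summary, 1 = per-class accuracy, 2 = confusion matrix
    summary = {}
    per_class = []
    matrix = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue                      # skip blank lines
        if fields[0] == "===":
            section += 1                  # same role as n_equals in the Awk sketch
        elif section == 0:
            if line.startswith("Correctly Classified"):
                summary["correct_pct"] = fields[-2]   # value just before the "%"
            elif line.startswith("Kappa statistic"):
                summary["kappa"] = fields[-1]
            elif line.startswith("Total Number of Instances"):
                summary["total"] = fields[-1]
        elif section == 1:
            if fields[0] == "TP":
                continue                  # column-header row
            # Columns: TP, FP, Precision, Recall, F-Measure, ROC Area, Class
            per_class.extend(fields[2:5])
        elif section == 2:
            if "|" in fields:
                # Keep only the counts to the left of the "|" separator.
                matrix.extend(fields[:fields.index("|")])
    return ([summary["correct_pct"], summary["kappa"], summary["total"]]
            + per_class + matrix)


row = parse_report(SAMPLE)
print(",".join(row))
```

On the sample this prints the same 21 values Dino listed as the desired output row. Files that deviate from the sample layout (extra rows, missing sections) would need additional checks.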
From: John Hasler on 3 Aug 2010 15:50

Dino writes:
> Does anyone have an idea how this could be accomplished?

Perl.

> I am not that great at programming, so writing a ruby or shell script
> to do this would take me weeks :-(

Hire someone.
--
John Hasler

Archive: http://lists.debian.org/87hbjb8plg.fsf(a)thumper.dhh.gt.org
From: Miles Fidelman on 3 Aug 2010 18:30

Kumar Appaiah wrote:
> On Tue, Aug 03, 2010 at 12:12:26PM -0700, Dino Vliet wrote:
>> Dear debian people,
>>
>> Can you help me with this task I have? I have a lot of files in a subdirectory
>> containing the following text:
>
> You should use awk.
>
> - cut -
>
> You ought to read the Awk manual, and then it would be a matter of a
> couple of hours of thought at most.

You might want to start by perusing the "sed" manual. It's an even simpler tool, though it might not be powerful enough for what you're doing.

Also take a look at:
http://www.smashingmagazine.com/2009/04/10/25-text-batch-processing-tools-reviewed/
Not Unix, but a collection of various visual tools for processing text in batches.

Looks to me like your biggest problem is that each file has several sections, each in a different format, so it's not just a matter of getting everything into a uniform tabular structure for import into a spreadsheet. You might want to think of this as a several-step process that either:

a. breaks each file into several files, each of a uniform format, then processes each type of file separately, or
b. processes each file to normalize it into something that's easier to turn into csv format

Or, as someone suggested, hire someone. This is the silly kind of task that's really easy if you're facile with regular expressions, shell scripts, and such, but can end up taking forever to get right. Judging from the sample data, I'm guessing you're at a university; there should be enough student hackers around who work cheap.

Miles Fidelman

--
In theory, there is no difference between theory and practice.
In<fnord> practice, there is. .... Yogi Berra

Archive: http://lists.debian.org/4C58969D.7080302(a)meetinghouse.net
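The several-step idea above (split each report into uniformly formatted sections, then handle each kind separately) might look roughly like the following. This is a sketch in Python, not anything posted in the thread; it assumes the section headers appear exactly as `=== Title ===` at the start of a line, and all function names and the abridged `REPORT` sample are hypothetical.

```python
import csv
import io
import re

# Abridged copy of the sample report (error-rate lines omitted).
REPORT = """\
Correctly Classified Instances 3018117 56.6808 %
Incorrectly Classified Instances 2306643 43.3192 %
Kappa statistic 0.2443
Total Number of Instances 5324760
=== Detailed Accuracy By Class ===
TP Rate FP Rate Precision Recall F-Measure ROC Area Class
0.618 0.343 0.681 0.618 0.648 0.697 1
0.519 0.244 0.617 0.519 0.564 0.693 2
0.296 0.141 0.056 0.296 0.094 0.66 3
=== Confusion Matrix ===
a b c <-- classified as
1784321 684983 416649 | a = 1
787342 1190428 314537 | b = 2
49255 53877 43368 | c = 3
"""

# (label, index of the wanted whitespace-separated field on that line)
WANTED = [
    ("Correctly Classified Instances", -2),
    ("Kappa statistic", -1),
    ("Total Number of Instances", -1),
]


def split_sections(text):
    """Step 1: break the report into named sections of uniform format."""
    parts = re.split(r"^=== (.+?) ===[ \t]*$", text, flags=re.MULTILINE)
    sections = {"summary": parts[0]}
    # re.split with a capture group yields: text, title, text, title, text...
    for title, body in zip(parts[1::2], parts[2::2]):
        sections[title] = body
    return sections


def summary_values(body):
    lines = body.splitlines()
    return [next(l for l in lines if l.startswith(label)).split()[idx]
            for label, idx in WANTED]


def accuracy_values(body):
    out = []
    for line in body.splitlines():
        fields = line.split()
        if len(fields) == 7:          # data rows have 7 columns; the header has more
            out.extend(fields[2:5])   # Precision, Recall, F-Measure
    return out


def matrix_values(body):
    out = []
    for line in body.splitlines():
        counts, sep, _ = line.partition("|")
        if sep:                       # only rows containing "|" carry counts
            out.extend(counts.split())
    return out


def reports_to_csv(texts):
    """Step 2: process each section type separately, one CSV row per report."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for text in texts:
        s = split_sections(text)
        writer.writerow(summary_values(s["summary"])
                        + accuracy_values(s["Detailed Accuracy By Class"])
                        + matrix_values(s["Confusion Matrix"]))
    return buf.getvalue()


csv_text = reports_to_csv([REPORT])
print(csv_text, end="")
```

To run this over a whole subdirectory, one could pass `reports_to_csv` the contents of each file, for example by reading every path returned by `pathlib.Path(...).glob(...)`; the directory name and file pattern depend on the actual setup and are not given in the thread.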