From: AyOut on 4 Nov 2009 21:07

I have a GC log file with entries like this one:

    2.729: [GC [PSYoungGen: 70850K->6800K(152896K)] 70850K->6800K (502464K), 0.0165440 secs] [Times: user=0.09 sys=0.04, real=0.02 secs]

I would like to parse this to output for easy plotting using gnuplot and would like the following output:

    2.729, 70850, 6800, 152896, 70850, 6800, 502464, 0.0165440, 0.09, 0.04, 0.02

I have tried with a command like this:

    awk '{if($1~/[0-9]+/ && $2=="[GC" && $3=="[PSYoungGen:")printf("%s %s %s %s %s %s\n", $1,$2,$3,$4,$5,$6)}' gc_20091104_024256_psghlc301.log |
      sed "s/[0-9][0-9]:.*GC \[PSYoungGen: /, /" |
      sed "s/K.*->/, /" |
      sed "s/K.*(/, /" |
      sed "s/K)//"

but it jumps over several fields and gives me the following output:

    2.7, 70850, 6800, 502464, 0.0165440

How can I set sed to not look at the last match ( "K(" ), but trigger on the first match?

Thanks
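On the sed question itself: a likely cause of the skipped fields is that `.*` in a sed pattern is greedy, so `K.*->` runs from the first "K" to the last "->" on the line. A common workaround is to replace `.*` with a negated bracket expression that cannot cross the delimiter. The following is only a minimal sketch of that idea on the sample entry, not a full replacement for the pipeline above; the variable names `line`, `greedy`, and `bounded` are made up for illustration.

```shell
# Sample entry from the post above.
line='2.729: [GC [PSYoungGen: 70850K->6800K(152896K)] 70850K->6800K (502464K), 0.0165440 secs] [Times: user=0.09 sys=0.04, real=0.02 secs]'

# Greedy: ".*" extends to the LAST "->" on the line, so the fields between
# the first "K" and that final arrow are swallowed.
greedy=$(printf '%s\n' "$line" | sed 's/K.*->/, /')

# Bounded: "[^>]*" cannot cross a ">", so the match ends at the FIRST arrow.
bounded=$(printf '%s\n' "$line" | sed 's/K[^>]*>/, /')

printf 'greedy:  %s\n' "$greedy"
printf 'bounded: %s\n' "$bounded"
```

The greedy form leaves only "70850, 6800K (502464K), ..." after the prefix, while the bounded form keeps every later field intact for the next substitution to handle.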
From: Ed Morton on 4 Nov 2009 23:11

AyOut wrote:
> I have a GC log file with entries like this one:
>
> 2.729: [GC [PSYoungGen: 70850K->6800K(152896K)] 70850K->6800K
> (502464K), 0.0165440 secs] [Times: user=0.09 sys=0.04, real=0.02
> secs]
>
> I would like to parse this to output for easy plotting using gnuplot
> and would like the following output:
>
> 2.729, 70850, 6800, 152896, 70850, 6800, 502464, 0.0165440, 0.09,
> 0.04, 0.02

Assuming the input is all on one line:

    $ cat file
    2.729: [GC [PSYoungGen: 70850K->6800K(152896K)] 70850K->6800K (502464K), 0.0165440 secs] [Times: user=0.09 sys=0.04, real=0.02 secs]

    $ awk '{OFS=", "; gsub(/[^[:digit:].]/," "); $1=$1}1' file
    2.729, 70850, 6800, 152896, 70850, 6800, 502464, 0.0165440, 0.09, 0.04, 0.02

Ed.
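For readers new to awk, the one-liner above relies on two behaviours: `gsub` with no target operates on the whole record, and assigning to any field (even `$1=$1`) makes awk rebuild the record using `OFS`. Below is a spelled-out sketch of the same technique. The function name `parse_gc` is hypothetical, and `[^0-9.]` is swapped in for `[^[:digit:].]` on the assumption that some older awk builds lack POSIX character classes.

```shell
# parse_gc: print only the numeric fields of each GC log line, comma-separated.
# Reads the files named as arguments, or standard input if none are given.
parse_gc() {
  awk '
    BEGIN { OFS = ", " }          # separator used when the record is rebuilt
    {
      gsub(/[^0-9.]/, " ")        # blank out everything except digits and dots
      $1 = $1                     # touching a field forces $0 to be rejoined with OFS
      print
    }
  ' "$@"
}
```

Usage would be `parse_gc file`, or `some_command | parse_gc` to read from a pipe.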
From: Kaz Kylheku on 5 Nov 2009 00:39

On 2009-11-05, AyOut <morty3e(a)gmail.com> wrote:
> I have a GC log file with entries like this one:
>
> 2.729: [GC [PSYoungGen: 70850K->6800K(152896K)] 70850K->6800K
> (502464K), 0.0165440 secs] [Times: user=0.09 sys=0.04, real=0.02
> secs]
>
> I would like to parse this to output for easy plotting using gnuplot
> and would like the following output:
>
> 2.729, 70850, 6800, 152896, 70850, 6800, 502464, 0.0165440, 0.09,
> 0.04, 0.02

Kaz's txr utility to the rescue.

    txr -c '@(collect)
    @num: [GC [PSYoungGen: @{size1}K->@{size2}K(@{size3}K)] @{size4}K->@{size5}K (@{size6}K), @secs secs] [Times: user=@utime sys=@systime, real=@realtime secs]
    @(end)
    @(output)
    @(repeat)
    @num, @size1, @size2, @size3, @size4, @size5, @size6, @secs, @utime, @systime, @realtime
    @(end)
    @(end)
    ' logfile

www.nongnu.org/txr
From: AyOut on 6 Nov 2009 14:39

On Nov 4, 10:11 pm, Ed Morton <mortons...(a)gmail.com> wrote:
> Assuming the input is all on one line:
>
> $ cat file
> 2.729: [GC [PSYoungGen: 70850K->6800K(152896K)] 70850K->6800K (502464K),
> 0.0165440 secs] [Times: user=0.09 sys=0.04, real=0.02 secs]
>
> $ awk '{OFS=", "; gsub(/[^[:digit:].]/," "); $1=$1}1' file
> 2.729, 70850, 6800, 152896, 70850, 6800, 502464, 0.0165440, 0.09, 0.04, 0.02
>
> Ed.

That's a beautiful solution! Now, there's a change in the log file output. The first field is now a date and time stamp

    2009-11-05T15:00:16.965-0600: 0.405: [GC 2112K->750K(7680K), 0.0204170 secs]
    2009-11-05T15:00:17.087-0600: 0.527: [GC 2862K->1010K(7680K), 0.0043760 secs]

and applying this command

    cat ${gclogfile}|sed 's/^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][A-Z]:*//'|sed 's/\.[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]:*//'|awk -F: '{print (NR==1||(!$1&&$1!=p)?++c:c),$0;p=$1}'

generates the following output:

    15, 00, 16, 0.405, 2112, 750, 7680, 0.0204170
    15, 00, 17, 0.527, 2862, 1010, 7680, 0.0043760

where the time stamp (15:00:16) shows up as 15, 00, 16. Is there a way to have the output look like this:

    15:00:16, 0.405, 2112, 750, 7680, 0.0204170
    15:00:17, 0.527, 2862, 1010, 7680, 0.0043760

Thanks!
From: Ed Morton on 6 Nov 2009 15:09

On Nov 6, 1:39 pm, AyOut <mort...(a)gmail.com> wrote:
> That's a beautiful solution! Now, there's a change in the log file
> output. The first field is now a date and time stamp
>
> 2009-11-05T15:00:16.965-0600: 0.405: [GC 2112K->750K(7680K), 0.0204170 secs]
> 2009-11-05T15:00:17.087-0600: 0.527: [GC 2862K->1010K(7680K), 0.0043760 secs]
>
> and applying this command
>
> cat ${gclogfile}|sed 's/^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][A-Z]:*//'|sed 's/\.[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]:*//'|awk -F: '{print (NR==1||(!$1&&$1!=p)?++c:c),$0;p=$1}'

"beautiful solution" discarded apparently!

> generates the following output:
>
> 15, 00, 16, 0.405, 2112, 750, 7680, 0.0204170
> 15, 00, 17, 0.527, 2862, 1010, 7680, 0.0043760
>
> where the time stamp (15:00:16) shows up as 15, 00, 16. Is there a
> way to have the output look like this:
>
> 15:00:16, 0.405, 2112, 750, 7680, 0.0204170
> 15:00:17, 0.527, 2862, 1010, 7680, 0.0043760
>
> Thanks!

Why do you keep going back to pipelines of cat, sed, and awk? If you're going to use awk anyway, you don't need sed or cat.
Try this:

    awk '{OFS=", "; t=substr($0,12,8); $0=substr($0,30); gsub(/[^[:digit:].]/," "); $1=$1; print t,$0}' file

Ed.
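The command above depends on fixed character offsets, so it assumes every line starts with a timestamp of exactly the same width. As a hedged alternative, here is a sketch that takes the time from the first whitespace-separated field and then blanks that field, so the rest of the line can be any length. This is a variation for illustration rather than the method posted in the thread; the function name `parse_gc_ts` is hypothetical, and it still assumes field 1 has the form `YYYY-MM-DDTHH:MM:SS...`.

```shell
# parse_gc_ts: like the numeric-field extraction above, but keeps HH:MM:SS
# from a leading ISO-style timestamp as the first output column.
parse_gc_ts() {
  awk '
    BEGIN { OFS = ", " }
    {
      t = substr($1, 12, 8)       # HH:MM:SS inside the timestamp in field 1
      $1 = ""                     # drop the timestamp so its digits are not split up
      gsub(/[^0-9.]/, " ")        # blank out everything except digits and dots
      $1 = $1                     # rebuild the record with OFS between fields
      print t, $0
    }
  ' "$@"
}
```

Usage would be `parse_gc_ts "$gclogfile"`, which for the two sample lines in the thread should give the `15:00:16, 0.405, ...` layout that was asked for.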