From: jplee3 on 20 May 2010 17:40

On May 20, 9:50 am, Michael Paoli <michael1...(a)yahoo.com> wrote:
> On May 18, 10:12am, jplee3 <jpl...(a)gmail.com> wrote:
>
> > On May 18, 10:00am, jplee3 <jpl...(a)gmail.com> wrote:
> > > On May 17, 9:31pm, Ed Morton <mortons...(a)gmail.com> wrote:
> > > > On 5/17/2010 8:05 PM, jplee3 wrote:
> > > > > I was wondering if there's a way to match more than one string in a
> > > > > line and then print/output the strings to the screen or a file, etc.
> > > > Do you mean find a string and print that string or do you mean find a regular
> > > > expression and print the string that matches it?
> > > > > I'm trying to achieve something similar to "grep -o "matching_string"
> > > > > file123" but with more than one string, and where each matching string
> > > > > is printed subsequently in a line.
> > > > > So I would want to match "matching_string1" and "matching_string2"
> > > > > from one line and print both those matches on a single line.
> > > > > Effectively, cutting out the text I don't want and printing out only
> > > > > what I want from a line.
> > > > > Is this even possible with grep or awk?
> > > > Yes, but provide some small sample input and the expected output given that
> > > > input so we're not guessing too much.
> > > Sorry for the confusion. It is in fact a regex string that I would be
> > > looking for.
> > > For instance:
> > > say the original line is: "<ID>=apache_server1_12345678,
> > > <ID2>=blahblahblah, <ID3>=FUBAR=apache_server2_12345678|
> > > IPADDRESS=192.168.1.1|MSG=hello"
> > > there are hundreds of lines like these...
> > > I want to extract so that I will get the following result from each
> > > line (a list of IPs and hostnames): "192.168.1.1
> > > apache_server1_12345678"
> > > Thanks guys!
> > Sorry, the formatting is screwed up on this one: I would want the list
> > to be in this format "192.168.1.1 server1" - where there's a space or
> > tab delimiter (not a carriage return/newline). Also, I'd want to
> > extract "server1" from "apache_server1_12345678"
> > Jon LaBadie's (thanks Jon!) sed command partially worked, but there's
> > no delimiter between the server name and IP. Also, I'm not 100% sure
> > what the regex would look like. For the IP address, I was using
> > "[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}" for "grep -o"
> > but not sure if that will differ for what I'm trying to do. I'm not
> > sure what the regex for extracting the server hostname would be
> > either.
>
> sed -ne 's/^.*<ID>=apache_\([^_][^_]*\).* IPADDRESS=\([.0-9][.0-9]*\)..*
> $/\2 \1/p' << \__EOT__
> <ID>=apache_server1_12345678, <ID2>=blahblahblah,
> <ID3>=FUBAR=apache_server2_12345678| IPADDRESS=192.168.1.1|MSG=hello
> 192.168.1.1 server1

Thanks all for the input! I've been so busy that I totally forgot to mention that I figured it out after modifying Jon's solution and playing around with some regex. I'm able to extract what I need at this point, and it's very useful for extracting other data too.

Thanks all for the help!
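
For anyone following along, here is a minimal sketch of the extraction discussed in this thread. It is not the command jplee3 ended up with (the thread does not show it); it is a single-line variant of the sed expression quoted above. Two things are assumptions: the function name extract_ip_host is hypothetical, and the space before IPADDRESS= has been dropped from the pattern so that it matches whether or not a space follows the "|".

```shell
# Hypothetical helper: read log lines on stdin and print "IP server" for each
# line that has both an <ID>=apache_NAME_... field and an IPADDRESS= field.
extract_ip_host() {
  # \1 = server name (text between "apache_" and the next "_")
  # \2 = IP address (digits and dots following "IPADDRESS=")
  # -n plus the p flag prints only lines where the substitution succeeded.
  sed -n 's/^.*<ID>=apache_\([^_][^_]*\)_.*IPADDRESS=\([0-9][0-9.]*\).*$/\2 \1/p'
}

# Sample line taken from the thread, written on one line.
printf '%s\n' '<ID>=apache_server1_12345678, <ID2>=blahblahblah, <ID3>=FUBAR=apache_server2_12345678|IPADDRESS=192.168.1.1|MSG=hello' | extract_ip_host
# prints: 192.168.1.1 server1
```

To run it over a whole log, something like "extract_ip_host < file123" should work. The IP pattern here is deliberately loose (any run of digits and dots); the stricter "[0-9]\{1,3\}\." form quoted above could be substituted inside the second group if needed.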