From: Hongyi Zhao on 19 Oct 2009 09:10

Hi all,

I want to write a script that annotates specific IP addresses by
appending the corresponding location information. In detail, my issue
is as follows:

Suppose I have two files. The first file stores the specific IP
addresses which I want to annotate, and the second file stores the IP
database along with the corresponding location information.

The first file has one IP address per line in dotted decimal format,
e.g.:

0.125.125.125
4.19.79.28
4.36.124.150
....

The second file has four fields per line, delimited by CHARACTER
TABULATION (U+0009). These four fields are StartIP, EndIP, Country,
and Local, e.g.:

StartIP         EndIP           Country     Local
0.0.0.0         0.255.255.255   IANA        CZ88.NET
4.19.79.0       4.19.79.63      American    Armed Forces Radio/Television
4.36.124.128    4.36.124.255    American    Technical Resource Connections Inc
....

Based on the second file, I want to reformat the first file by
appending the corresponding location information to each IP address
in it, i.e., for the above example, I want to obtain the following
result:

0.125.125.125#IANA CZ88.NET
4.19.79.28#American Armed Forces Radio/Television
4.36.124.150#American Technical Resource Connections
....

Any hints on this issue will be highly appreciated.
Thanks in advance.

--
.: Hongyi Zhao [ hongyi.zhao AT gmail.com ] Free as in Freedom :.
From: Ed Morton on 19 Oct 2009 12:15

$ cat file1
0.125.125.125
4.19.79.28
4.36.124.150
$
$ cat file2
StartIP         EndIP           Country     Local
0.0.0.0         0.255.255.255   IANA        CZ88.NET
4.19.79.0       4.19.79.63      American    Armed Forces Radio/Television
4.36.124.128    4.36.124.255    American    Technical Resource Connections Inc
$
$ cat tst.awk
BEGIN{ FS="\t"; OFS="#" }
function ip2nr(ip, nr,ipA) {
   # aaa.bbb.ccc.ddd
   split(ip,ipA,".")
   nr = ipA[1] * 1000000000 + ipA[2] * 1000000 + ipA[3] * 1000 + ipA[4]
   return nr
}
NR==FNR { addrs[$0] = ip2nr($0); next }
FNR>1 {
   start = ip2nr($1)
   end   = ip2nr($2)
   for (ip in addrs) {
      if (addrs[ip] >= start && addrs[ip] <= end) {
         print ip,$3" "$4
      }
   }
}
$
$ awk -f tst.awk file1 file2
0.125.125.125#IANA CZ88.NET
4.19.79.28#American Armed Forces Radio/Television
4.36.124.150#American Technical Resource Connections Inc

Regards,

Ed.
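The ip2nr() function in the script above is what makes the range test possible: it turns a dotted address into a single number, so that "is this address between StartIP and EndIP" becomes two numeric comparisons. Because every octet is at most 255, giving each octet its own three decimal digits (multipliers of 1000) keeps the octets from colliding, and numeric order matches address order. A minimal sketch for checking the encoding in isolation follows; the shell wrapper name ip2nr_demo is hypothetical and not part of the thread, and printf "%.0f" is used only so that the large value prints as a plain integer on any awk.

```shell
# ip2nr_demo is a hypothetical wrapper for trying the encoding on one address.
ip2nr_demo() {
    awk -v addr="$1" '
    function ip2nr(ip,    nr, ipA) {
        # aaa.bbb.ccc.ddd -> aaabbbcccddd (three decimal digits per octet)
        split(ip, ipA, ".")
        nr = ipA[1] * 1000000000 + ipA[2] * 1000000 + ipA[3] * 1000 + ipA[4]
        return nr
    }
    BEGIN { printf "%.0f\n", ip2nr(addr) }'
}

ip2nr_demo 4.19.79.28    # prints 4019079028
```

The largest possible value, for 255.255.255.255, is 255255255255, which still fits exactly in the double-precision numbers awk uses.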
From: Ed Morton on 19 Oct 2009 12:40

Adding a "delete" and a "next" would make the script more efficient if
you have a large list of IP addresses in file1 and each range in file2
is distinct:

BEGIN{ FS="\t"; OFS="#" }
function ip2nr(ip, nr,ipA) {
   # aaa.bbb.ccc.ddd
   split(ip,ipA,".")
   nr = ipA[1] * 1000000000 + ipA[2] * 1000000 + ipA[3] * 1000 + ipA[4]
   return nr
}
NR==FNR { addrs[$0] = ip2nr($0); next }
FNR>1 {
   start = ip2nr($1)
   end   = ip2nr($2)
   for (ip in addrs) {
      if (addrs[ip] >= start && addrs[ip] <= end) {
         print ip,$3" "$4
         delete addrs[ip]
         next
      }
   }
}

Ed.
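One caveat with the "next" in the version above: it moves on to the next database line after the first hit, so if two addresses from file1 fall inside the same range, only one of them is printed. Both versions also emit results in database order rather than in file1 order. A possible alternative, offered only as a sketch, is to load the ranges first and look up each address as it is read. It assumes that file2 is sorted by StartIP and that its ranges do not overlap (neither is stated in the thread), so a binary search can be used. The function name annotate_ips is hypothetical, the base-256 encoding is swapped in for the base-1000 one (it orders addresses the same way), and printing unmatched addresses unchanged is a choice made here, not something the original poster specified.

```shell
# annotate_ips FILE1 FILE2  (hypothetical helper, not from the thread)
# FILE1: one address per line; FILE2: tab-separated ranges with a header line.
annotate_ips() {
    awk '
    BEGIN { FS = "\t"; OFS = "#" }
    function ip2nr(ip,    a) {
        split(ip, a, ".")
        return ((a[1] * 256 + a[2]) * 256 + a[3]) * 256 + a[4]
    }
    # The database is passed first: remember every range, skipping the header.
    NR == FNR {
        if (FNR > 1) {
            n++
            lo[n] = ip2nr($1); hi[n] = ip2nr($2); loc[n] = $3 " " $4
        }
        next
    }
    # The address list is passed second: binary-search the sorted ranges.
    {
        x = ip2nr($0); l = 1; h = n; hit = 0
        while (l <= h) {
            m = int((l + h) / 2)
            if (x < lo[m])      h = m - 1
            else if (x > hi[m]) l = m + 1
            else { hit = m; break }
        }
        if (hit) print $0, loc[hit]
        else     print $0
    }' "$2" "$1"
}
```

Called as "annotate_ips file1 file2", this keeps one output line per input address, in the order the addresses appear in file1. If file2 is not sorted, the linear scan from the scripts above is the safer choice.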
From: Sidney Lambe on 19 Oct 2009 16:11

On comp.unix.shell, Ed Morton <mortonspam(a)gmail.com> wrote:
> Adding a "delete" and a "next" would make the script more efficient if
> you have a large list of IP addresses in file1 and each range in file2
> is distinct:

Why is it that Ed Morton, who is supposed to be the
Great Awk Educator, doesn't even comment his scripts,
which is basic to good scripting and obviously necessary
for educating people on the use of awk?

Sid
From: Ed Morton on 19 Oct 2009 17:36

Sidney Lambe wrote:
> Why is it that Ed Morton, who is supposed to be the
> Great Awk Educator, doesn't even comment his scripts,
> which is basic to good scripting and obviously necessary
> for educating people on the use of awk?
>
> Sid

Sid - which part of that code did you find confusing? I'll be happy to
explain it to you.

Ed.